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PROCEEDINGS FOR THE ICASE WORKSHOP 
ON HETEROGENEOUS BOUNDARY CONDITIONS* 


A. Louise Perkins 

Massachusetts Institute of Technology 
and 

Jeffrey S. Scroggs 
North Carolina State University 


ABSTRACT 

Domain Decomposition is a complex problem with many interesting aspects. The choice 
of decomposition can be made based on many different criteria, and the choice of interface 
of internal boundary conditions are numerous. Even more interesting from a modeling 
perspective is that the various regions under study may have different dynamical balances, 
indicating that different physical processes are dominating the flow in those regions. It may 
be desirable to use different numerical approximations in the regions where the physical 
processes are dominated by different balances. .. , 

The Institute for Computer Applications in Science and Engineering (ICASfi), recogniz- 
ing the need to more clearly define the nature of these complex problems, sponsored this 
workshop on Heterogeneous Boundary Conditions at the NASA Langley Research Center 
in Hampton, Virginia. This proceedings is an informal collection of the presentations and 
discussion groups. It also includes a bibliography that contains many of the references that 
discuss related topics. 

The proceedings begins with summaries of the discussion groups. Then papers describing 
the talks are presented. Lastly, the bibliography is included, and an index by subject is 
provided. 


♦This workshop was sponsored by the Institute for Computer Applications in Science and Engineering 
(ICASE) at the National Aeronautics and Space Administration (NASA) at the Langley Research Center, 
Hampton, VA 23665. 
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3 - ; Visual Artifacts 

Discussion Leader: Bill Gropp 
Argonne National Laboratory 
Report by: A. Louise Perkins 
Massachusetts Institute of Technology 

The discussion began with Dr. Bill Gropp introducing the concept of visual artifacts in 
numerical solutions. He presented examples of errors that appeared to be significant to the 
human eye, but that were well below the error criteria for the problem, and did not impact 
the quality of the numerical solution. 

The discussion then focused on defining a model problem where visual artifacts cbuld be 
examined explicitly. 

Model Problem \ 9=3 -T i 7 

A. Louise Perkins , / 

Massachusetts Institute of Technology 

jo?' 5 *"' 

1 Introduction 

At the ICASE Workshop on Heterogeneous Boundary Conditions a general optics problem 
that allows interference was suggested for study. The large-scale interference pattern that 
develops is quite sensitive to small perturbations in the boundary conditions. Hence it 
seemed ideal for testing and observing errors due to grid interface effects introduced by 
domain decomposition methods. Although the problem specification is somewhat arbitrary, 
it is necessary to be specific in order to compare results because it is expected that several 
researchers will explore this problem. 




2 Error Measurements 

The interference that we wish to test lends itself to error analysis using both a visual as well 
as more standard numerical acceptance criteria. The more standard numerical criteria are 

• propagation error 

• L\ error 

• Z-2 error 

• Loo error 

By propagation error we mean the phase difference between the computed location of the 
wave front and the exact location of the same wave front. 

Visually this problem gives rise to an interference pattern that can be compared for 
sharpness as well as location. We are interested in seeing these differences across the artificial 
interfaces introduced by the decomposition. 
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3 Motivation 


Dr. Bill Gropp suggested we examine a problem that had visual meaning in its errors, to 
allow studying the types of errors introduced between refined and coarse meshes at a more 
intuitive level. He suggested an optics interference problem. 


4 Optical Interference 

Fermats’ principle of least time is recast in Feynman et. al. [1] briefly from a quantum- 
dynamical perspective. By considering “rays of light” as photons, the ray path can be 
considered a sum of the individual paths of the photons. The ray path is then defined by 
the probability of each photon taking different paths. 

Upon encountering a barrier that contains a wide slit, a wave will continue through 
the slit almost undisturbed and geometric optics is a good model for ray behavior. But 
when the slit is sufficiently reduced in size, the choices for photon paths are truncated, and 
the probability distribution is altered, affecting the geometry of the wave front as it passes 
through the slit. 

This behavior is more easily understood by considering the simpler, less accurate, but 
more intuitive, Huygens principle which states that “all points on a wavefront can be con- 
sidered as point sources for the production of spherical secondary wavelets. After a time the 
new position of the wavefront will be the surface of tangency to these secondary wavelets”, 
as described in Halliday and Resnick [2]. Considering this simplified wave theory of light, a 
barrier in a wave path with a slit on the order of the wave length will cause diffraction. That 
is, the end points allowed to pass through the slit will no longer have symmetrical wavelets 
on either side, and they will bend at the ends. 

Placing two slits aside each other will replicate the limiting behavior twice, and the 
resulting wave patterns will interact, causing interference. The interference pattern will be 
visible, and dependent upon the original wave frequency. This interference pattern is quite 
sensitive to phase errors, so that the choice of grid sizes influences the solution behavior. 
This is the interesting aspect of this problem. 


5 The Interference Pattern 


Consider the wave equation 


d 2 u 

W 


= Au. 


Here A is the Laplacian. In the positive quadrant place two slits along the x-axis at locations 
X\ and X 2 , with 

d = x 2 — X\. 


Here x\ and X 2 are the mid-points of the two slits. Then for any point in the quadrant, 
P = (x p ,T/p), let di be the distance from x x to P , and d 2 be the distance from x 2 to P. 
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Waves will arrive at P out of phase due to the difference in the path lengths d\ and d<i* The 
maximum interference will occur when 

\d 2 — di\ = m\ 


where A is the original incident plane wavelength and m is a nonnegative integer. The 
minimum, of course, occurs at the half distances (m + |)* 

The size of our slits can now be specified. They should be at most of width A. A slit 
width less than or equal to the incident wavelength is sufficiently small to diffract the wave 
on a visible scale. This problem is interesting because generation of the large scale pattern 
depends on how accurately the small scale dynamics has been captured about the slits. 

» 

6 Geometry 

Here we assume that the problem has been normalized so that 

xi = 0 , 

x r = 1, 

Vb = 0, 
yt - 1 , 

We arbitrarily choose xi = \ - 5A, x 2 = | + 5A. This geometry is illustrated on the 
following page in Fig. 1. 

7 Initial Conditions 

The initial region, including all boundaries except the slits, should be quiescent (u = 0). 
Prescribe a plane wave described by the following function: 

2ir 

— V t) (7-1) 

/\ 

where A is the wavelength and v is the phase velocity. This impacts our domain along the 
slits [\ — 5.5A, | — 4.5A] and [i + 4.5A, | + 5.5A] of the x-axis for all time t > 0. Here, the 
equation reduces to 

sin^-(-vt). (7-2) 

A 

The phase velocity is known, allowing radiative boundary conditions to be defined. This 
is done in the section 9. 



8 Exact Solution 


The exact interference pattern is the superposition of the two waves. For any point P(x,y) 
in the domain, the travel time to P will be different from the two slits. Let the time from 
slit Xi be fj. Then the value at P will be the sum of the Pi where 

2tt « 

Pi(x,y) = sin—(~v(t - f,)) 

for (t-ti) > 0 and zero otherwise. We note that the distance from slit x { to the point P(x,y ) 

di = (( x - Xif +y 2 )h. 


9 Boundary Conditions 


All boundaries, except the slits which are prescribed with the incoming plane wave, evolve 
with the solution. Any workable boundary conditions can be applied on these boundaries, 

with the goal that these boundary conditions should influence the interference pattern as 
little as possible. 

For our prescribed incident plane wave we have the exact solution. However, prescrip- 
tion of these exact boundary values on our numerical approximation of the solution could 
cause the numerical approximation to degrade. Hence we recommend radiative boundary 
conditions because we know the phase velocity exactly and hope that the numerical phase 
velocity is quite close to the correct one. Then an open boundary condition can be used that 
advects the interference pattern out of the domain by advancing the wave equation, 


da du 

m +v a^-° 


where n is the direction normal to the boundary. 


10 Domain Decomposition Method 

Our purpose in examining this test problem is to measure directly the effect of mesh refine- 
ment and the resulting mesh interfaces on a known wave that is sensitive to phase errors, 
while concurrently being able to visually display a meaningful picture of the effects of the 
refinement-induced error on the solution. 

The Coarse mesh must be able to adequately represent the interference pattern for the 
visual comparisons. It should be no larger than Ax ~ Ay - and may need to be smaller. 
The workshop suggested We SU gg es t that £ an d £ all be tested. However, these 
values are only apriori suggestions as we have not yet worked this problem. 

The refined mesh must be able to adequately capture the diffraction behavior, s'o that 
the plane wave front bends ’ as it passes through the slit. Given that the coarse and refined 
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meshes are sufficiently accurate, the phase errors introduced during the problem solution will 
be a function of the sound speed on the two grids plus the coarse/refined grid interaction 
errors. 

Due to the geometry of the problem, non-adaptive local uniform mesh refinement is 
adequate for the domain decomposition. 


11 Discretization 

We suggest a second order in space and time Leap Frog/Hopscotch discretization method 
should be used. We anticipate beginning with A = 0.1, and A = 0.05. 


12 Analysis 

We expect that calculations for boundary errors and the ratio of mesh refinements will be 
analyzed. These can be done both analytically and by comparison to an everywhere fine 
mesh. 


References 

[1] R.P. Feynman, R.B. Leighton, and M. Sands, The Feynman Lectures On Physics Mainly 
Mechanics, Radiation, and Heat, Addison- Wesley, (26-8), 1963. 

[2] D. Halliday, and R. Resnick, Fundamentals of Physics , John Wiley and Sons, 1970. 
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Fictitious Domain methods are constructed in the following manner: Suppose a partial 
differential equation is to be solved on an open bounded set ft in two or three dimensions. 

Let 72 be a rectangular domain containing the closure of f 2. The partial differential equation 
is first solved on R. Using the solution on 72, the solution of the equation on ft is then 
recovered by some procedure. 

The advantage of the fictitious domain method is. that in many cases the solution of a 
partial differential equation on a rectangular region is easier to compute than on a non- 
rectangular region. Not only do more accurate algorithms exist for rectangular regions but 
they are also more computationally efficient. The difficulty in the method, of course, is the 
procedure that is used to tie the “global” solution on 72 to the “local” solution on ft. This 
is generally where the inefficiencies of the method creep in and where most of the current 
research on the method is being done. A classic application of a fictitious domain method 
is the computation of the solution of an elliptic partial differential equation on a general 
region. Here, the global solution on a rectangular region can be computed by a fast Poisson 
solver. These solvers are quite effecient for rectangular regions but not for other geometries. 

The global solution is tied to the local solution on the general region via ideas in capcitance 
matrix techniques. A discusion of this approach is given in [1]. Other uses of fictitious 
domain methods are given in the remaining references. 

Fictitious domain methods for solving elliptic PDEs on general regions are also very ef- 
ficient when used on a parallel computer. The reason for this is that one can use the many 
domain decomposition methods that are available for solving the PDE on the fictitious rect- 
angular region. In domain decomposition methods, the global rectangle is decomposed into 
sub-rectangles of equal size and elliptic PDEs are solved on each sub-rectangle to provide 
approximations to the elliptic PDE on the global rectangle This process is iterated upon 
several times to get successively better approximations, see references in [4]. This is signifi- 
cant in that the approximations on the sub-rectangles can often be computed simultaneously 
and, thus, can be carried out in parallel on the individual processors of a multiprocessor. 
Moreover, because the approximations to the global equation are furnished by the solu- 
tions of PDEs on sub-rectangles of equal size, they can be calculated in the same*amount 
of time. This is advantageous because the load balancing and synchronization overhead in- 
curred from managing the tasks of computing the approximations on the processors becomes 
almost non-existent. 

The discussion on fictitious domain methods began with a short talk by Roland Glowinski 
where he gave some examples of a variational approach to fictitious domain methods for 
solving the Helmholtz and Navier-Stokes equations. After Glowinski’s introduction to the 
subject, the question of the usefulness of the method was tossed out to the audience. A 
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comment was made that the fictious domain method seems to be the only reasonable way of 
solving three-dimensional partial differential equations on general domains. This statement 
was justified by the admission by several people that the implementations of standard finite 
element or finite difference techniques to 3-D problems is a horrendous task. Theje was a 
considerable amount of discussion on this point but, in the end, the audience agreed that 
the fictitious domain method is a viable approach to three-dimensional problems and more 
research is needed in this area. 


References 
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Abstract 

We test the performance of five different interface preconditionings for domain-decomposed 
convection-diffusion problems, including a novel one known as the spectral probe, while varying 
mesh parameter, Reynolds number, ratio of subdomain diffusion coefficients, and domain aspect 
ratio. The preconditioners are representative of the range of practically computable possibilities 
that have appeared in the domain decomposition literature for the treatment of nonoverlapping 
subdomains. We demonstrate through a large number of numerical examples that no single precon- 
ditioncr can be considered uniformly superior or uniformly inferior to the rest, but that knowledge 
of particulars, including the shape and strength of the convection, is important in selecting among 
them in a given problem. 

1, Introduction 

The solution of linearized convection-diffusion equations of the form 

V</>- V-cV<£ = /, (1.1) 


where <f> is a conserved quantity (energy, mass fraction, momentum component, etc.) transported 
under the influence of velocity field c and diffusi vity c is required throughout computational physics. 
Discretization by finite differences or finite elements results in a large sparse system of algebraic 
equations whose solution can be demanding in computational resources and is one of the many 
driving forces for parallel computation. Because the strength of coupling between a pair of dis- 
crete unknowns governed by an equation like (1.1) decays with physical separation (more or less 
isotropically depending upon c), it is natural to partition the problem spatially when looking for 


concurrency in the solution algorithm. Parallelism is, in fact, only one of several compelling reasons 
for the recent surge of research on domain decomposition algorithms exemplified by the volumes [9, 
10, 17]. Others include a powerful theory describing optimal or near-optimal algebraic convergence 
rates for hierarchical preconditioners of domain-decomposed type, the convenience of composite ar- 
ray data structures for describing complex shapes, a desire to employ solution techniques and quality 

-f- . . 5 * 

/ fThis report is an augmentation of a report entitled Interface Preconditioning Jor Domain- Decomposed Convection- Diffusion 
/ Operators by the first and third authors that appeared in "Proceedings of the Third International Symposium on Domain _ 
/ Decomposition Methods,” T. F. Chan, R. Glowins ki, J, Perianx, aneL, Q- B.Widlund, eds. , pp. 245-262, SIAM, 199CK 


nDepartuielil of IvTitheinat icsf UC LA , Los Angeles, CA 90024. The work of tills author was supported in paqJTby NSF ASC 
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* * Mathei hatTcs aiul C oni piiTer Science Division™ Argonne National Laboratory, Argonne, IL 60439. The work of this author 
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of Energy, under Contact W-3 1- 1 U 9 -ENJ ^j3&c^ - 
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software restricted to problems witli various local uniformity requirements (which are subproblems 
with regard to (1.1)), and sheer problem size, which can ultimately push numerical ill-conditioning 
and serial memory traffic beyond acceptable limits. 

Preconditionings for interfacial degrees of freedom have been the focus of much attention dur- 
ing the development of domain decomposition methods, and deservedly so, since interfaces of lower 
dimension than the original domain of definition of the partial differential equation are created by 
a predominant form of nonoverlapping decomposition related to nested dissection of the underlying 
finite difference or finite element matrix operator. We refer generically to such forms of domain 
decomposition as Scliur iteration, since elimination of the subdomain interiors leaves a Schur com- 
plement system for the separator unknowns. Additional interest in interface preconditioning comes 
from the fact that the classical Schwarz iteration, the prototype for overlapping decompositions, 
has been placed into correspondence with a stationary iteration having as unknowns the interfa- 
cial degrees of freedom of a nonovcrlapping decomposition [6, 11]. This correspondence between 
Schwarz and Schur methods enriches the study of domain decomposition algorithms in general, 
because properties which are more easily analyzed in one framework may be extended to the other. 

The present contribution focuses on the performance of a variety of easily computed Schur 
complement preconditioners in a rather special context: a single interface dividing a rectangle into 
two subrectangles in which the capability of performing exact solves is presumed. We consider a 
scalar convection -diffusion operator under a uniform or “terraced” diffusion coefficient and a variety 
of continuity-satisfying flow fields chosen to exhibit the relative advantages and disadvantages of 
the preconditioners. The pristine nature of the problem class allows focusing on the quality of the 
interfacial preconditioning, alone, in four different limits: large discrete problem size, large Reynolds 
(or Peclet) number, large diffusion coefficient ratio, and large aspect ratio. (The Reynolds number 
is the dimensionless ratio cl/e, where c is a characteristic velocity, / a characteristic length, and l a 
characteristic diffusivity. Large values characterize strongly nonsymmetric, convectively-dominated 
systems.) Any or all of these limits could be important in a production engineering code whose 
parallelization might be sought through domain decomposition. We show that no single interface 
preconditioner is best in all limits, and therefore seek to qualitatively rank their sensitivities to 
these limits and identify realms of superiority. 

Several different coefficient fields c and c are studied because the performance of all of the 
preconditioners are sensitive to them and unjustified optimism or pessimism can result from too 
narrow a study. Two of our five preconditioners have been amply studied previously in the sym- 
metric positive definite context of pure diffusion. There have been very few studies of any of them 
in the convection-diffusion context, and since this case is also relatively untouched by theoretical 
approaches, apart from spatially invariant velocity distributions, numerical studies are continuing 
to yield interesting information. — 

Wc comment briefly on a few other issues which bear on our choice of scope. It is possible to 
set up an alternative framework for nonovcrlapping decompositions in which interfacial coupling 
is simply discarded, or partially accounted for in ways that do not require special treatment of a 
separator set; see, c.g. y [l] and [26]. In so doing one obtains the advantages of greatly simplified 
coding and less inter-domain data traffic per iteration. Problems dominated by local interactions 
can be handled quite acceptably by decoupling; see c.g [23]. However, in problems which are 
diffusively dominated (more fundamentally, problems whose Green’s functions have support which 
is not substantially confined within artificial subdomain boundaries), such approaches haye limited 
applicability to large numbers of gridpoinls and/or subdomains. . 

The special case of a single interface obviates discussion of preconditioning the set of vertices 
where multiple interfaces intersect. Vertex preconditioning is very important but also more readily 
prescribable, at least in two dimensions. A coarse grid problem for the vertices having the same 
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structure as the undecoinposed original problem can be derived directly from the differential op- 
erator by employing a hierarchical basis discretization. The interface system, on the other hand, 
corresponds to a pseudo- differential operator, the numerical analysis of which is relatively less well 
developed in the presence of convective terms. In a preconditioner consisting of component blocks 
corresponding to subdomains, vertices, and interfacial edges (and interfacial planes in three dimen- 
sions), any one block can limit the overall performance. A study of interface preconditioning is 
thus necessary, but not sufficient, for guiding the construction of complete preconditioners. 

Finally, as to the relevance of our scope, we note that practical problems often involve several 
simultaneous convection-difTusion operators linked through coefficients or source terms. Continued 
study of the scalar case is, however well motivated by techniques such as the alternating block 
factorization [4] which successfully employ scalar preconditioners inside of a change of dependent 
variables which partially decouples the original system. 

The algorithmic framework of our experiments is described in Section 2, followed by intro- 
duction of the five interface preconditioners and a brief discussion of their properties in Section 3. 
Section 4 contains performance measurements in the form of iteration counts along several axes of 
problem parameter space. We draw some conclusions and recommendations in Section 5. 


2. Schur Domain Decomposition Methods 

We take as our starting point the matrix equation Ax = b arising from a finite difference dis- 
cretization of of (1.1). The domain decomposition method we employ is an iterative substructuring 
method consisting of three elements: (1) the operator A whose inverse action we would like to 
compute with an accuracy commensurate with the discretization, (2) an approximation B to A y 
whose inverse action is computationally convenient to compute, and (3) an acceleration scheme for 
the preconditioned system which requires only the ability to form the actions of A and B~ l on a 
vector. In all cases reported herein, A is derived from a second-order central differencing of the 
diffusion term and a first-order upwind differencing of the convection term. Extensions to second- 
order upwind differencing have been carried out in, for instance, [27]. We use right-preconditioned 
GMRES [30] as our iterative acceleration scheme, that is, we solve AB~ l y = b by the applying the 
standard GMRES algorithm to (AB~~ l ) then recover x through the solution of Bx = y . 

GMRES is guaranteed to converge in a finite number of steps for nonsingular AB~ X even in 
the presence of nonsymmetry or indefiniteness, assuming exact arithmetic. The maximum number 
of steps required is the number of distinct eigenvalues of the preconditioned operator. This con- 
vergence result depends upon dynamically storing a complete basis for the Krylov space built from 
powers of AB~ l acting on the initial residual vector. For large problems, this much memory can 
easily become excessive, and GMRES is often truncated or restarted [30] in cases where it does 
not converge within a predetermined number of steps. However, wc allow full GMRES iteration 
in our experiments, up to some maximum number of steps (set at 30 herein) which is sufficient in 
all but two cases. Because fewer than 30 steps are almost always sufficient, we effectively suppress 
from consideration the restart or truncation parameter. This parameter can be important in a 
“production” setting. 

The substructuring enters through the manipulation of A and B into forms which possess 
large block zeros, for the sake of concurrency or for some of the other reasons noted in the intro- 
duction. For elliptic operators such as (1.1), A is irreducible; hence there are no block triangular 
permutations. However, if the domain is cut by the removal of a swath of gridpoints as wide as the 
semi-bandwidth of the stencil, two large subprobleins are created whose only coupling is through 
the small removed set. For five-point stencils on logically tensor product grids, we may choose 
a single row or column of unknowns. (A two-point-wide generalization has been studied for the 
thirteen-point biharmonic stencil in [8].) Ordering the separators last, we obtain 


( 2 . 1 ) 


/-'hi 0 A^X / x i \ / \ 

Ax = I 0 A 22 A 2-j ) I *2 1 = ( 6a I s 6- 

\ A31 A32 A33 / \ x 3 ) \b'A ) 

Here, An and A22 are five-point operators with bandwidth no larger than that of the naturally 
ordered original system, but A33, which renders the coupling between the points on the interface 
itself, is tridiagonal. The other blocks contain the coupling of the separator unknowns to the 
subdomains, and vice versa. From the point of view of the continuous operator they represent 
derivatives in directions .normal to the interface. 

Block Gaussian elimination of the unknowns xi and X2 would yield the Schur complement 


system 





Cx 3 

= d. 

(2.2) 

for Z3, where 





C = A33 — A 3 1 A [ 

A 13 — A 33 A 2 2 A 23 

( 2 . 3 ) 

and 





d = 63 - A3] Ay j 1 

bi- A32A.J2 62 • 

( 2 . 4 ) 


If X3 can be found, the subdomain problems are decoupled. However, direct computation of the 
generally dense C in order to solve ( 2 . 2 ) requires as many pairs of exact subdomain solves as there 
are degrees of freedom in 3:3, which is generally prohibitive. It is also unnecessary inasmuch as 
iterative techniques have been devised which require many fewer iterations than the dimension 
of X3, and which furthermore require only approximate subdomain solves in each iteration. As 
mentioned already, we shall ignore the option of inexact subdomain solves in the sequel, effectively 
reducing the iterations to the interface, but we nevertheless make use of a gen era l purpose code 
which retains the interior degrees of freedom in carrying out the numerical experiments. 

Wo consider two families of preconditioners D , the structurally symmetric 

(A n 

lh = I 0 

\ A31 

(An 0 
= 0 An 

\ A31 Asi 

where M approximates the Schur complement C ( 2 . 3 ) of An and A 22 in A , and the simpler block 
triangular 

(An 0 A r A 
«2 = « A 22 A; 3 • 

\ 0 0 M ) ^ ^ ... 

The factorized form of above shows that the cost of applying the inverse of B\ is one solve 
with M and two solves each with A n and An- There is an inherent sequentiality to the subdomain 
solves, however, since the system involving M in the left factor requires data from the first set of 
subdomain solves. The inverse of D2 can be applied to a vector at the cost of solving one system 
each with M, An, and An- The system for M is solved first, followed by independent solves in 
the subdomains which use the interface values as boundary conditions,^ : t‘ . -- : 

We assume throughout that the A„ are invertible. (This is certainly a reasonable requirement 
for a discrete convective-diffusive operator and is guaranteed herein for all Reynolds numbers by 
upwind differencing.) Under this assumption, C is also invertible [ 15 ]. ; 
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/ 0 A u Aj 3 
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0 0 I 


M + A 3 1 A , / A 13 + A32A22 A23 


-1 


For matrices arising from standard quasi-uniform finite element discretizations of elliptic partial 
differential equations, A lias a condition number of C>(/*“ 2 ), whereas C has a condition number of 
0{h~ A ) [5, 29]. The equivalence of conjugate gradient iterations on the Schiir complement system 
with preconditioner M and on the full mbstructurcd matrix A with preconditioner B\ was shown 
in [24], 

For reference in Section 4, it is interesting to note the forms of the preconditioned operators 
AB^ 1 , and AB^ 1 . In order to make the formulae more readable, we combine the independent 
subdomain solves into a block matrix /1 q, and denote the separator block by Ar, to re-express the 
above matrices as 


A _ | ^or \ f > _ ( Mi 

/ti n /t[' /’ 1 V /ti n M + A i 


■^nr \ n _ (An Anr\ 

■mA^Aur)' 2 ~ V 0 M )' 


whence 


and 


i __ ( Aq + Aq /lor A/ /Ipo^o 


' -M-hivuAj 


iu l A uv M-'\ 

M~ } ) 


B 


-i _ I y tji - A 


2 ' 0 

From these expressions it can easily be verified that 


t^/tnr-M i\ 

M~ l ) ' 


AB- X 


i-l 1 

(I - CM~ l )AinAft l CM 


S-‘) “ lld AB i‘ = (/irons' 


CM- 


(2.5) 


It is evident that if C is exactly represented by M, then AB^ 1 reduces to the identity, and 
an iteration involving /1/ij" 1 will converge in one step requiring two sets of subdomain solves. 
An iteration involving A 77.J 1 , on the other hand, will converge in two steps (from an arbitrary 
initial guess) if M — C , but each step requires only one set of subdomain solves. (These iteration 
counts do not include the filial solve with either B\ or B 2 which is required to unwind the right- 
preconditioning.) More generally, if M is sufficiently close to C in the sense that the lower-left block 
of the structurally symmetric system is small, ||(7 - CM ~ 1 ) || « 1, we expect that an iteration 
based on B 2 will require an extra iteration relative to an iteration based on B\. Conversely, if M is 
a poor preconditioner for C, so that the lower left block becomes large, the use of the structurally 
symmetric system could require more iterations than the use of the block triangular system. Both 
behaviors arc illustrated in Section 4. 

Note from (2.5) that AB^ X and AB 2 l have identical spectra, as Arnoldi estimates for the 
eigenvalues obtained as a by-product of the GMRES iterations also show. However, Krylov se- 
quences based on the respective operators will in general differ, and there is little that can be said 
about which method will lead to faster convergence for general c if M and C are not sufficiently 
close. 

For some of the precondilioner components M we consider, the overall preconditioning process 
is numerical unstable, as will be seen in Section 4. Even though the iterations involving AB~ l may 
converge, the final result after unwinding the preconditioning may have few or even no significant 
figures. For this reason, we always check the actual residual ||/- Ai|| at the end of each calculation. 


3. Scliur Interface Preconditioners • 

We proceed to delineate five alternatives for the matrix M . 

3.1. Interface Probe Preconditioner 

Interface probe preconditioning is a family of methods for approximating the true Schur com- 
plement C defined in (2.3) by low bandwidth matrices. We use the nomenclature IP(fc) to denote 
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the approximation sequence M = A? - Ek, k = 0, 1 , 2, . . where £7^ is a matrix of semi-bandwidth 
A; which produces the same action as /Irn^Q 1 zlfip on a set of 2k + 1 test vectors. Note that when A 
arises from a five- point finite-difference discretization both the IP(0) and IP( 1) preconditiohers are 
tridiagonal because Anr is. As k increases beyond 1, M acquires additional diagonals. Selection 
of test vectors of appropriate sparsity patterns enables the coefficients of Z7* to be read directly ofT 
of the product involving hence the term “probe.” We report only on the row-sum 

conserving IP(0) herein. IP(1) is only rarely more cost effective than IP(0) over the range of non- ' 
symmetric scalar five-point stencil problems studied herein, and a law of diminishing returns sets 
in as fc is increased, 

IP(0) was invented independently by Chan and Eiscnstat in 1985, immediately generalized to 
IP (lc) in [14], and adopted for variable coefficient symmetric problems in [24] (where it was called 
the “modified Schur complement” method) and for nonsymmetric problems in [25, 26], Symmetric 
versions of IP(0) and 1P(1) have also been employed in [2, 3], Many algebraic and spectral properties 
of banded and circulant probe preconditioners are derived in [13]. The interface probe technique 
has the advantage of being purely algebraic in character, and hence capable of being defined for 
arbitrary operators. It is aesthetically pleasing that the tunable parameter k may be taken from 
the crude approximation of 0 all the way to the full bandwidth exact solution. It has also been 
generalized in a straightforward way to multicomponent systems [26]. However, IP(fc) for low k 
is not expected to be particularly useful for arbitrary matrices. The low k limit is motivated by 
the observation that the elements of C decay rapidly away from the diagonal for elliptic problems. 
In sufficiently simple elliptic problems (c. jy M those possessing constant coefficients) preconditioners 
described below taking better advantage of this structure are also possible, leaving IP(A;) large 
but not unlimited regions of problem parameter space in which to exercise. Interface probing 
has the advantage of being automatically adaptive to spatial variation in the coefficients but the 
disadvantage of not possessing the property of spectral equivalence, a consequence of which is that 
it degrades as the mesh is refined. Experimentally [24], the condition number of the preconditioned 
Schur complement system for the Laplacian goes like /i -1 / 2 , and this bound is conjectured to be 
the best attainable for any tridiagonal matrix based on experiments with an optimization code in 
[20]. An bound is proved for a circulant probe-preconditioned system with periodic boundary 
conditions on the boundaries normal to the interface in [13]. 

3.2. Spectral Preconditioner 

The spectral preconditioner is an exact eigendccomposition of a single interface, rectangular 
domain, constant coefficient convection-diffusion operator described in [12] as a generalization of [7]. 
We consider only the Dirichlet case herein, but generalizations to Neumann boundary conditions are 
straightforward. Let an interface of n interior nodes (i.c., h~ l = n + 1) separate two subdomains of 
the same discrete length, and discrete heights rn\ and m 2 , respectively, over all of which is satisfied 
the difference equation 


(IX i— IJ T J T CX {. flj T -j- €Xi t j — \ — fi,j y (3.1) 

where i denotes the free index alon<j the interface. We may write M = DW AW~ l D~ l , where W 
is the discrete sine transform of length n with matrix representation 

[VV 7 ],j = \/2/tsiu ijirh, (3-2) 

D is the diagonal matrix with elements 

l«l. • (2) (i ,,/ \ ' ' (3-3) 
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Figure 1: Modes of the Dirichlet problem (3.1) for n = 15. 
(a) |«/c| = 1, j = 1; (b) \a/c\ = 1.21, j = 1; (c) |a/c| = 1, 
j = 8; (d) |a/c| = 1.21, j = 8; (c) \a/c\ = 1, j = 15; (f) 
\a/c\ = 1.21, j = 15. (The left-hand column of modes are for 
the case of no tangential convection.) 


and A is a diagonal matrix with elements 


[A], = A 


1 / 1 + ll 


mi + 1 


1-7- 


m i+l 


+ 


1 _ yn 2 +i j \/[*+ v/oc( 2- CT,)] 2 -4de, 


(3.4) 


where, in turn, 



The derivation of these formulae (see [12] for full details) begins with the observation that the 
columns of the matrix ( DW ) are the eigenvectors of the tridiagonal matrix formed by the coefficients 
along the interface, viz., tridiag(u,h,c). Sample such modes are plotted in Figure 1 for two different 
values of the ratio |u/c| corresponding to zero and constant nonzero tangential components of the 
convection. The nonvauishing first-derivative convection term has the effect of multiplying the 
sinusoids by an exponential. 

The philosophy of using the spectral preconditioner for arbitrary interfacial systems is that 
of solving an approximate (constant coefficient) problem exactly, rather than an exact (general 
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coefficient) problem approximately. One of its advantages is that it can be defined without requiring 
the ability to solve problems in adjacent subdomains, as required by the interface probe technique. 
All that is needed is some averaging rule to obtain the coefficients a through e from the data of 
the associated regions. All our tests herein employ a simple average of the coefficients along the 
interface alone. Another advantage is its automatic adaptivity to domain aspect ratio, since the 
boundary conditions are built into the derivation. We note that application of M~ l is inexpensive: 
two one-dimensional FFTs sandwiched between three diagonal matrix multiplications. 

3.3. Spectral Probe Preconditioner 

The spectral probe proconditioncr, introduced here for the first time, is conceptually a hybrid 
of the interface probe and the spectral preconditioners. Spectral probing assumes a form for the 
eigenvectors of C like that derived for the constant coefficient operator of the previous subsection 
(again based on spatially averaged coefficients), but then populates the diagonal matrix A by 
probing the true Schur complement, so that some spatial adaptivity is accommodated within a 
spectrally equivalent framework. 

We set M = DWAW~ l D~ ] where W and D are defined as above (or where D is alternatively 
simply set to the identity matrix, corresponding lo a — c, for reasons which will become clear in 
Section 4). A is then determined by probing with the interface vector of all l’s. This is the same as 
the standard test vector for IP(0). To be explicit, we read olT the elements of A from the equation 

W~ l D~ l CDW * 1 = A* 1. 

The action of C is computed by means of a pair of subdomain solves using DW * 1 as the interface 
boundary condition. Note that the spectral probe preconditioner reduces to the spectral precon- 
ditioner in the constant coefficient Dirichlet case, because then C is exactly diagonalized by the 
given similarity transform. 

3.4. Laplacian Square-root Preconditioner 

As a base-line reference, and because it appears throughout the literature, we include tests 
with a method based on the square-root of the one-dimensional Laplacian operator, easily written 
as: 

M = WAW-\ 

where A is now the diagonal matrix with elements [A ]j = 2 y/ai. We sometimes denote this oper- 
ator as Dryja’s preconditioner because of its popularization in this context in [16]. More general 
discrete antecedants were considered in [18]. It is difficult to pinpoint the discovery of the spectral 
equivalence of this operator to the Schur complement of the Laplacian, since the continuous analog 
of this equivalence has been known for some time. This preconditioner is expected to be good in 
diffusion-dominated problems, or in the discrete limit h 0. 

Note that this preconditioncr is distinct from the spectral preconditioner (§3.2) for the Lapla- 
cian. Dryja’s preconditioner achieves a constant bound on the number of iterations as the mesh is 
refined, but the constant is generally higher than that achievable with the coefficient and aspect 
ratio adaptability of the previous two techniques. The literature also records two important precon- 
ditioners intermediate between the Dryja and spectral techniques, namely [19] and [5]. The latter, 
the Neumann-Dirichlet preconditioner, contains some of the adaptive capabilities of the spectral 
preconditioner since it relies on subdomain solves in its construction and hence contains much coeffi- 
cient information. It is similar to probing techniques in this regard. In fact, the Neumann-Dirichlet 
precondi tion er is exact in problems possessing symmetry across the separator set. All four of the 
techniques of [5, 7, 16, 19] were tested in [24], but for brevity we test only the extremes here. 

3.5. Tangential Preconditioner 

Finally, we consider a simple preconditioncr possessing partial adaptivity, a lower-dimensional 
restriction of the operator to the interface created by setting all of the normal derivative terms in 
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the operator to zero and retaining just the remainder in M. For (1.1) these are just the tangential 
derivative terms. The obvious motivation for this technique is that it is simple and is expected 
to work well in the limit of strong convection along the interface, a limit which turns out to be 
troublesome for the spectral and spectral probe preconditioners. In addition, its very satisfactory 
behavior in the multidomain experiments in [21] recommend it. For reasons not yet theoretically 
explained, it performs very well in conjunction with the block triangular form of the the overall 
precon dilioner described in Section 2. A minor disadvantage is its requirement of partial knowledge 
of the differential operator, rather than simply the elements of the discrete operator A* To be 
specific, it is necessary to store separately the contributions to A arising from the normal derivative 
terms, and all other terms. , 

4. Numerical Experiments 

All of the experiments to follow except for those of Table 14 are posed on the unit square (/ = 1 
in the definition of the Reynolds number, lie) with homogeneous Dirichlet boundary conditions. 
The five different continuity-satisfying flow fields tested are shown in Figure 2(b)-(f) , along with 
a purely diffusive baseline case (Figure 2(a)). When Reynolds numbers are reported below for the 
variable coefficient cases, they arc always based on the maximum velocity in the region. (See [28] 
for details on the jet and cell flows and other experiments on this particular problem set.) The 
interface divides the rectangle into equal upper and lower portions, as marked on the figure in the 
dashed line. In addition to cases with constant diffusion, we study in §4.3 a convectionless case 
with piecewise constant, but disparate, difTiisivitics on either side of the interface. 

There is a constant source term of unit strength in the interior. Although it is special, a zero 
initial guess for the solution vector is employed throughout, since this will usually be the natural 
choice when (1.1) arises for a Newton increment, as part of an outer nonlinear iteration. The 
performance of the preconditioners is measured by the number of iterations required to reduce the 
initial residual by a factor of 10~ 5 , regardless of the mesh resolution. The tables of iteration counts 
arc grouped by subsections into four sets of experiments. 

4.1. Sensitivity to Mesh Refinement 

Tables 1 through G examine a constant Re situation as the (uniform) mesh is refined by three 
successive powers of 2. Of course, the discrete diffusion term, the Laplacian, becomes more and 
more dominant with each refinement of the grid, since it scales as h~ 2 as compared with the h~ l 
scaling of the convection term. This is the asymptotic limit for which D and its relatives S and 
SP are designed. In the first table, the Laplacian is studied in isolation (Re = 0). In the next 
five convective cases, Re = 1G. For the coarsest mesh (/t -1 = 8), the contributions to the diagonal 
of the discrete operator from the two terms are equal at this Reynolds number (the cell Reynolds 
number, R.e c = ch/e , is 2). 
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Table 1: Iteration counts for the pure diffusion problem as a 
function of mesh parameter for two different preconditioner 
structures and five different interface blocks. 
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Figure 2: Streamfunctioa contour plots of the two-dimen- 
sional flow fields represented by c in the numerical experi- 
ments. (a) Pure Diffusion; (b) Normal Convection; (c) Tan- 
gential Convection; (d) Skew Convection; (e) Jet Convection 
(the domain is the right half of a symmetric flow field); (f) 

Cell Convection. 

The S (spectral), SP (spectral probe), and I) (Dryja) columns of Table 1 reveal their exactness 
or spectral equivalence, respectively. Because iteration count is a threshold measurement, most of 
the data is subject to ±1 perturbation upon modest adjustment of the convergence tolerances, but 
the S and SP residuals at convergence are zero to machine precision. The deterioration of IP like 
some negative power of h is evident on both the structurally symmetric (B\) and block triangular 
(# 2 ) sides of the tabic. The tangential preconditioner is the only one with markedly different 
performance depending upon the structure of B . Here, as below, it is excellent in conjunction with 
the block triangular fofffi" ' " ; - : 



Table 2; Same as Table 1, except for normal convection at « 

a Reynolds number of 16. 


Table 2, for a normal convection problem, is similar to Table 1 except that IP improves slightly 
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as each of the terms A M A^ { [ A ^ and /1 32 AJ 2 * A 2 3 being approximated by a diagonal matrix becomes 
less important relative to /I 33 because one of the coupling matrices is small. For instance, if the 
convection is from subdomain 1 into subdomain 2 , A^ and A 32 are weak. 
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Table 3: Same as Table 1, except for tangential convection 
at a Reynolds number of 16. 


The importance of the D matrix in the spectral preconditioner is evident in Table 3 in which a 
tangential convection problem is considered. The version of SP employed in this study approximates 
the D in its definition as the identity; using the true D here would reproduce the spectral results 
in this constant coefficient case, just as in the previous two tables in which D = I anyway. Though 
SP and D are spectrally equivalent, they require an order of magnitude more iterations than S, and 
are surpassed by IP in the smaller problem range on the structurally symmetric side, and by the 
tangential preconditioner on the block triangular side. 
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Table 4: Same as Table i, except for skew convection at a 
Reynolds number of 16. 


Table 4 contains the last of the constant coefficient test examples, featuring a skew convection 
(inclined at 45 degrees relative to the interface). Overall results are not very different from the 
purely tangential convection case. Most of the entries In the skew table lie at or between cor- 
responding entries in the normal and tangential tables. Those of the tangential preconditioner, 
however, are often worse in the intermediate skew convection case than at either extreme. 
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Table 5? Same as Table 1, except for jet convection at a 
Reynolds number of 16. 
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The jet case recorded in Table 5 tends io level the preconditioner landscape because the 
constant coefficient approximation of S is no longer exact. S remains the best technique as h~ l 
increases, but its margin of superiority over other spectrally equivalent techniques (with D as a 
nonadaptivc extreme) is small. 
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Table 6: Same as Table 1, except for cell convection at a 
Reynolds number of 16. 


The cell case of Table G is the greatest equalizer among the test cases, because the interface 
cuts a zone of recirculation, i.e., there is normal flow across it in both directions, and none of the 
methods holds an edge of superiority. Performance under the block tridiagonal preconditioner is 
notably uniform for the hist four methods. 

4*2. Sensitivity to Reynolds Number 

Tables 7 through 11 examine the influence of increasing Reynolds number. Values of Re of 
0, 4, 16, 64, 256, and 1024 are considered at Ir ] = 64. Thus, the third row of each table in this 
subsection is the same as the last row of the table of corresponding flow type in the first set,*and the 
first row of each table is the same as the last row of the pure diffusion case in Table 1. Progressing 
down the rows of the table the nonsymmetry of the operator increases. Between row»four and 
five the convection terms begin to contribute more heavily than the diffusion terms to the diagonal 
elements of A. 
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Table 7: Iteration counts for the normal convection prob- 
lem at a constant mesh parameter of 1/64 as a function of 
Reynolds number for two different preconditioner structures 
and five different interface blocks. 

Table 7 shows that in the presence of constant normal convection, techniques S and SP remain 
exact at any Re, while IP catches up at high Re, and D and T successively worsen. (For D, this 
is the drawback of finite h in a method derived for h — ► 0.) Qualitatively, the trends are the same 
for either preconditioner structure, although the tangential method continues to be much better in 
the block triangular formulation. 

The Achilles’ heel of the spectral technique appears when there is strong convection tangential 
to the interface, as seen in Table 8. In this limit in which | ajc\ differs sufficiently from unity the 
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Table 8: Same as Table 7, except for tangential convection. 
The hyphen denotes loss of precision, and the “>” more than 
30 iterations. 


latter terms in Z), which have this ratio raised to as much as (n - l) st power, can approach the 
machine epsilon (or its reciprocal, depending upon the direction of the convection). In reference to 
(3.3), we note that for n = 64, (tt/c)^' 1 )/ 2 ^ c lnAC u % 10“ 16 when (a/c) « 10 (~ 32 / 63 ) r* 0.31. Under 
upwind differencing, it only takes a cell Reynolds number of 2 to produce a ratio of 3 between the 
upwind and downwind stencil coefficients a and c. Thus, Re c = 2 is the borderline of stability 
for the spectral preconditioner with respect to tangential convection. In the tables, the Re = 64 
row corresponds to a cell Reynolds number of unity, and the Re = 256 row to a cell Reynolds 
number of 4; thus, they straddle the transition. GMRES iterations based on the spectral interface 
preconditioner do terminate for the hyphenated entries, but the residuals based on the resulting x 
vectors shows complete loss of precision. 

The spectral probe technique docs not lose precision, because of the assumption that D — 7; 
however, a diagonal approximation for W~ l CW is poor, and it simply takes too long to converge. 
The Dryja preconditioner, which makes no attempt to adapt to the strong directionality of the 
problem also deteriorates with Re. At low Re where M and C are close, the B\ iteration count 
is lower by one; but at high Re, where isotropic M is very different from unidirectional C, the B 2 
iteration count is better. The interface probe technique meanwhile improves with Re as it captures 
more of the increasingly diagonally dominant problem within its own sparsity structure. Finally, the 
tangential technique is excellent for a tangentially dominated operator. The cross-diffusion which 
it neglects becomes of negligible importance as the problem effectively decouples into independent 
problems for An, ^22? and A33 in which the upstream boundary condition is all important. 
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Table 9: Same as Table 7, except for skew convection. 


Most of the observations of the high Re tangential flow also apply to the skew flow in Table 9, 
however, the latter differentiates between IP, which repeats its tendency to improve as Re grows, 
and T, which no longer matches the physics of the problem, and is worse than IP, although it is 
still superior to the Fourier-based methods as a module of the block triangular preconditioner. 
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We also note that the spectral probe technique captures a significant part of the physics in 
this case, adapting to the co-dominant normal convection and tending to a constant iteration count 
without breaking down. It thus rescues the spectral technique and indicates how the robustness of 
the spectral preconditioner can be maintained with a compromise in efficiency. In a general purpose 
code, the elements of the D matrix could differ from unity but be bounded artificially, allowing 
partial tangential adaptivity with full normal adaptivity. 
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Table 10: Same as Table 7, except for jet convection. 


As in the spectral equivalence tests in Table 5, the jet case recorded in Table 10 tends to 
diminish the extremes of preconditioner behavior relative to the uniform skew convection case. 
The IP and SP results worsen while the D and T methods nearly hold their own relative to Table 9. 
The pure spectral method survives at a higher overall Reynolds number because the tangential 
velocities at the interface are lower than the maximum core velocity of the jet, to which Re is 
scaled. 
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Table 11: Same as Table 7, except for cell convection. 


Again, the cell case of Table 11 is an equalizer for most of the methods; however, the perfor- 
mance of the spectral probe technique is singularly good. 1 his is easily explained as follows: the 
average tangential velocity along the interface is zero because of the symmetry of Figure 2(f), so 
D = / for both S and SP. However, S also employs an zero average for the normal velocity, whereas 
SP adapts to strong inflow and outflow along opposite halves of the interface. The performance 
of IP, which improves with Re in all previous tables, deteriorates in this table because no pair 
of coupling matrices is weak (see comments on Table 2) under recirculation. Performances under 
the structurally symmetric and block tridiagonal schemes of the tangential preconditioner become 
similar at high Re. The recirculating cell flow is in some sense a worst case for a single interface. If 
the domain of Figure 2(f) is further decomposed by a vertical interface, putting a vertex in the cen- 
ter, all four interfaces would be free of two-signed velocity components, and easier to precondition. 
(The cross-point preconditioning then becomes an important subject.) 
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Figure 3: Contour plots of the solution to V ■ cVu = / 
with homogeneous Dirichlet boundary conditions and / = 1 
for various ratios of the (piecewise constant) diffusivity in 
the adjoining subdomains. “Ratio” is (i 0 wer/( upper an d the 
product (lower (upper is maintained at unity. 

4.3. Piecewise Constant Diffusivity 

This subsection focuses on the effectiveness of the preconditioners in heterogeneous problems. 
With reference to Figure 3, we assign a higher difTusivity in the lower domain than in the upper. 
We report convergence results in two limits: fixed diffusivity ratio and increasing h * (Table 12) 
and fixed h -1 and increasing diffusivity ratio (Table 13). 

The performance of the two probe and the tangential preconditioners in the terraced diffusivity 
case in Table 12 is identical to their performance in the uniform case of Table 1. However, the 
Dryja preconditioner is slightly worse and the spectral preconditioner (based on the arithmetically 
averaged diffusivity) is much worse than in the uniform case. In fact, S is the worst preconditioner 
in the terraced diffusion case while SP is the best, the probe supplying crucial information. 

As the ratio of diffusivities increases for a fixed problem size, as in Table 13, all methods 
asymptote monotonically to fixed convergence rates, as shown theoretically in [13]. The physical 
interpretation of the asymptotically large diffusivity ratio is that virtually all of the variation 
of solution in response to the (fixed) forcing occurs in the subdomain with the lesser diffusivity. 
Figure 3 illustrates this phenomenon. Flic contours arc kinked at the intei face since it is the normal 
component of the diffusive flux, cVu, not Vu itself, which must balance on either side. The solution 
along the interface is asymptotically the boundary value of zero. The spectral probe method is a 
singularly good performer in this limit. 

4.4. Sensitivity to Aspect Ratio 

Table Id examines the constant diffusion case under aspect ratios ranging from l:-jg (a squat 
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Table 12: Iteration counts for the step diffusion problem as 
a function of mesh parameter for two different preconditioner 
structures and five different interface blocks. The ratio of the 
diffusion coefficients of the two subdomains is 64. 
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Table 13: Iteration counts for the step diffusion problem at 
a constant mesh parameter of 1/32 as a function of diffusiv- 
ity ratio for two different preconditioner structures and five 
different interface blocks. 

rectangle with subdomains just two cells high ) to 1:2 (a tall rectangle composed of two square 
subdomains). Note that the discrete length of the interface, n, remains constant at 63 in all 
examples, while heights and m 2 vary from 1 to 63. 
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Table 14: Iteration counts for the pure diffusion problem 
at a constant mesh parameter of 1/64 as a function of as- 
pect ratio for two different preconditioner structures and five 
different interface blocks. 


We confirm that S and SP are completely adaptive to aspect ratio in this const ant c oefficient 
problem, which is not true of any other method, including Dryja’s. The tangential preconjitioher 
is understandably good when the narrow (more strongly boundary influenced) direction is the 
tangential one, and poorer when this aspect ratio is reversed. The nonmonotonic character of 
IP is interesting, showing that it adapts well to cither extreme and is poorest in the balanced 
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intermediate case. 

5. Conclusions 

We conclude by pulling together some overall assessments of the preconditioners tested in the 
previous section. The interface probe technique is the most general purpose and robust of the meth- 
ods. It is always definable and adapts well to high Re and extreme aspect ratio, but is nonoptimal 
at high h~ l and is occasionally the worst method. It does very well in predominantly unidirectional 
strongly convective flows. Generalizations to multiple interfaces, multiple components, and inexact 
subdomain solves are straightforward, but not pursued herein (see [26]). 

The spectral method is mathematically the method of choice asymptotically in /i” 1 , where 
physically well resolved problems should end up. However, since it is based on a global constant- 
coefficient approximation it docs not perform well in heterogeneous problems. Furthermore, high 
cell Reynolds numbers can cause it to go unstable, and flow fields whose actual values along the 
interface differ greatly from their average values are poorly represented. A stabilizing technique 
was proposed which could preserve the robustness of the spectral method at high cell Reynolds 
numbers, namely selection of an exponent base for D in (3.3) closer to unity than the true |a/c|. 

The spectral probe method is as good as the spectral method when W alone is a good ap- 
proximation to the eigenvectors of C (f.e., low tangential convection). A J5-modulated version of 
spectral probe can be just as effective (and just as vulnerable to strong tangential convection) as 
the pure spectral method in a constant coefficient problem. SP adapts better than S to normal 
convection variation and it adapts perfectly to piecewise constant heterogeneities in the diffusivity. 
As a “probe” method, it shares the coding disadvantages of IP. 

The Dryja preconditioner is never exclusively the best method, but is as good as either S or SP 
in a variable coefficient problem in the limit h — ► 0, where the diffusive contributions to A dominate. 
However, the extra cost of S is insignificant compared to D, and SP costs only extra subdomain 
solves in the pre-processing, so these techniques (suitably stabilized for tangential convection) will 
almost always be preferable in applications. 

All of the above techniques are relatively insensitive to the choice of the overall preconditioner 
structure. The tangential interface preconditioner is an exception, for reasons yet to be explained 
theoretically. It is much better under the block triangular form of the overall preconditioner, and 
is very competitive with the other techniques under physically predictable circumstances, namely 
tangentially dominant convection or narrow aspect ratio. It is also simple to code and has been 
successfully generalized to multicomponent systems (see [22]). 

We note that when exact subdomain solves are performed, the block triangular form of the 
preconditioner, with its one-directional flow of information from the separators to the interiors, 
is almost always preferable to the structurally symmetric form B\ in terms of execution time, 
since a full set of subdomain solves is saved at each iteration and the iteration counts are usually 
comparable. The structurally symmetric form is also obviously useful when A is itself symmetric, 
since it then admits preconditioned conjugate gradients rather than GMRES as the iterative solver. 

Clearly, a user who understands his problem physically and is willing to customize can do well 
by choosing among a variety of interface preconditioner modules, perhaps using different ones on 
differently resolved grids within the same overall solution procedure. Hopes of finding a single “best” 
method from among those considered must clearly be dismissed, but the field is still young. It is clear 
that general purpose adaptivity will require some form of “probing” via subdomain solves; taken to 
the limit, one obtains C directly. Such probing has an associated cost comparable to an iteration 
step, and must be undertaken conservatively. Future developments which iteratively improve the 
interface preconditioning based on accumulated subdomain solve data would be welcome, and so 
would more hybrids along the lines of the spectral probe which incorporate both analytical and 
numerical data. 
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Abstract 


In this paper we present a preliminary report on our work on the tracking of inter- 
nal layers in a singularly-perturbed convection-diffusion equation. We show why such 
tracking may be desirable, and we also show how to do it using domain decomposition 
based on asymptotic analysis. 


1 Introduction 


In this paper we present the analogue of a shock-tracking scheme for the resolution of an 
internal layer and its interaction with an ordinary boundary layer at the outflow. In the 
computation of compressible flows at high Mach number there has long been competition 
between shock tracking and shock capturing, and it is now generally agreed that both are 
needed. We generally find that the number of strong shocks is small, and they should be 
tracked in order to assure accuracy of the solution. For reasons of efficiency, however, the 
large number of weak shocks reverberating around the domain should be computed by a 
reliable shock-capturing scheme such as Roe’s method [14] or the method of Colella and 
Woodward [7]. Shocks are not always the most important features in fluid flows, but the 
tracking of such other phenomena is still far behind shock tracking. We first show why it 
may be desirable to track an internal layer, and then we show how such tracking may be 
accomplished via domain decomposition. 

For the sake of simplicity we consider a specific time-independent, singularly-perturbed, 
convection-diffusion equation 


a d x u + b d y u = e A u + F 


( 1 . 1 ) 
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on a bounded domain SI in the (x, y)-plane. Here, A denotes the Laplace operator, and e 
is a small, positive number. For the moment, we impose Dirichlet conditions u = f on the 
boundary dSl, but later we sometimes treat mixed boundary conditions. The function / 
is required to be piecewise smooth. (We use the term ‘smooth’ to mean some convenient 
degree of differentiability, say C 2 .) We assume that the coefficients a and 6 are smooth 
functions of x and y on SI. The source term F is assumed to be a smooth function of 
x, y, and u. Furthermore, we impose the restriction that |a| -f ]6| ^ 0 in the closure of 
SI. This assumption implies that there are no stagnation points, and it greatly diminishes 
the complexity of the domain decomposition. Our assumption of semilinearity is much less 
restrictive because nonlinear problems are often solved via a sequence of linear problems 
with variable coefficients. Our discussion does not pertain to shock layers, however, since 
they violate the assumption of smoothness of a and b. . 

Previous work [4], [11], [12] on algorithms for (1.1) using domain decomposition based 
on asymptotic analysis has treated the special case of b = 0. It is true that a transformation 
of coordinates may be used to convert (1.1) to the case 6 = 0. In this paper we show why 
such a transformation is very desirable, and we present an algorithm to carry it out. 

The development of numerical methods for (1.1) in the singularly-perturbed case requires 
an understanding of the asymptotic behavior of its solutions as c j 0. We therefore begin 
with a brief description of the relationship between u and the solution U of the reduced 
equation 

ad x U + bdyU = F. (1.2) 

For more details see the books by Chang and Howes [l] and Eckhaus [8]. Equation (1.2) is 
easily solved by the method of characteristics, 


dx dy dU 

ds ’ ds ’ ds 


(1.3) 


The first two equations in (1.3) define characteristic curves. It is clear that we cannot 
impose the boundary condition U = / at every intersection of a characteristic curve with dSl. 
Instead, we subdivide dfl into three sets, depending on the direction of the vector (a, 6). The 
inflow boundary Tj is the subset of dSl on which (a, 6) points into fi, the outflow boundary T 0 
is the subset of dSl on which (a, 6) points out of 0, and the tangential boundary F T is the 
subset of dSl on which (a, 6) is tangent to dSl. For (1.3) the boundary condition U = / is 
imposed only on the inflow boundary T/. 


It is reasonable to expect to have u ~ U for the solutions u of (1.1) wherever Atx is not 
too large, i.e, wherever u is smooth. Because of the smoothness of the source term F and 
of the coefficients a and 6, the only mechanism for introducing nonsmooth behavior into the 
solution u of (1.1) is through the boundary condition u = f. One obvious difficulty is that 
we cannot force U = f on the outflow and tangential boundaries To U Ft- The resolution of 
this difficulty is that there are boundary layers across which u changes rapidly from u fn U 
to u = f. More precisely, when / is smooth the relation u ss U is true except in the following 
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portions of ft. There may be what are called parabolic boundary layers along the tangential 
boundary IV, and there may be ordinary boundary layers along the outflow boundary To. 

Let us take a moment to explain the terminology ‘ordinary boundary layer’ and to de- 
scribe its properties. Consider a point P on To- In the vicinity of P we may construct a 
transformation (cr, r) t-> ( x,y ) such that the origin (fr,r) = (0,0) is mapped to the point P. 
We may further require that the portion of a neighborhood of the origin with <t > 0 is mapped 
into ft, so that a segment of the axis a — 0 is mapped onto a segment of the boundaTy To. 
In terms of the variables cr and r the differential equation (1.1) takes the form 

% 

2 dtjU + b d T u = e(ci d 2 a u + ci d„d T u + c 3 d\u) + F. (1.4) 

Note that the definition of outflow boundary implies that if Ci is chosen to be positive, then 
it follows that a < 0. We have seen that we expect to have u m U away from the boundary 
<7 = 0, while we require that u = f on the boundary <j = 0. That is, we expect u to vary 
slowly with respect to r but rapidly with respect to <7. Let us therefore introduce the scaling 

<7 = e<7, r = f (1.5) 

into (1.4). If we formally discard all but terms of the order of 1/e, we obtain a reduced 
equation 

ad*V = d&*V. (1.6) 

The term ordinary boundary layer derives from the fact that (1.6) is an ordinary differential 
equation. The term exponential boundary layer is also used, because the solution V of (1.6) 
is the sum of a constant and an exponential function. Note that because ajc\ < 0, this 
exponential decreases with increasing d. Note also that in terms of the variable a the rate of 
decrease is of the order of exp {— /cer/e}, where k is an average value of |a|/c}. We therefore 
expect the width of the ordinary boundary layer to be O(e). . The book by Chang and 
Howes [1] gives theoretical justification for all of these heuristic manipulations. 

In the vicinity of the tangential boundary IV we use the characteristic curves (1.3) to 
define one set of coordinate lines, and we use them as the foundation for a local mapping 
(s,t) t-> (x,y) in the vicinity of a point P on IV. In terms of these coordinates (1.1) takes 
the form 

d„u - e(c 4 d]u -f c 6 d s d t u + c 6 dfu) + F. (1.7) 

We remark that the definition of flow direction implies that c« > 0. We may require that a 
segment of the axis t = 0 maps onto a segment of the boundary IV and that positive values 
of t correspond to points in the interior of ft. Thus, the boundary layer has to accommodate 
a rapid transition from u ~ U for t > 0 to u — f at t = 0. Let us therefore introduce the 
scaling 

S = S , t — tyfl ( 1 . 8 ) 

into (1.7). If we formally delete all terms smaller than 0(1), we obtain the reduced equation 

diW = c 4 d?W + F. (1.9) 
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This equation is parabolic, giving rise to the term parabolic boundary layer. Furthermore, 
the thickness of the boundary layer for (1.9) is 0(1) with respect to t. With respect to t 
the parabolic boundary layer is therefore of width 0(y/e). Again, theoretical justification for 
these remarks may be found in [ 1 ], 

If / has a discontinuity at a point P on Tj, then by (1.3) the function U will have a 
corresponding discontinuity in along the characteristic curve 7 through P. Similarly, if 
the Lie derivative of / along T/ is discontinuous at P, then grad U is discontinuous along 7 . 
Because u is smooth, the lack of smoothness of U causes u to deviate substantially from U in 
a neighborhood of 7 . As in the case of a parabolic boundary layer, if we introduce coordinates 
(s,<) derived from the characteristic variable a as given by ( 1 . 3 ), we find that ( 1 . 1 ) maps to 
an equation of the form (1.7) and that (1.9) is the appropriate reduced equation. We are 
therefore led to the conclusion that such an internal layeris parabolic in nature and that its 
width is 0{^/l). We again refer the reader to [ 1 ] for further justification. 

In the next section we analyze the behavior of a standard central difference scheme when 
there is an internal layer tilted at an angle to the grid, and we show that the numeri- 
cal approximation introduces downstream oscillations unless the interned layer is resolved. 
Therefore, we must use either a fine grid, an artificial increase of the viscosity e, or a grid 
aligned with the layer. Here we are discussing a grid effect, in [13] Hedstrom and Osterheld 
showed that the numerical errors for a coarse grid aligned with an internal layer are minimal 
even at large cell Reynolds numbers. 

In Section 3 we present an algorithm for the construction of an orthogonal grid with 
one coordinate direction aligned to the vector field (a, b). This mapping requires the solu- 
tion of the telegraphers’ equation. In Section 4 we introduce a domain decomposition for 
a problem ( 1 . 1 ) with an internal layer and an ordinary boundary layer. In this domain de- 
composition the ordinary boundary layer and the internal layer have their own subdomains, 
and there is a separate subdomain for the region where these layers interact. In addition, 
in each subdomain a separate numerical method is used, depending on the local asymptotic 
behavior of the solution. 


2 A layer tilted to the grid. 

In this section we use a heuristic argument based on the modified equation to show why it is 
generally unwise to permit an internal layer not to be aligned with the grid. Specifically, we 
show that the standard central-difference scheme has grid effects which are modelled by a 
modified equation in the style of Warming and Hyett [16]. See Griffiths and Sanz-Serna [10] 
for a more modern exposition on modified equations. We shall see that the solutions of the 
modified equation are integrals of Airy functions, multiplied by a decaying exponential. The 
oscillations of this Airy function may or may not be completely damped by the exponential, 
depending on the values of a dimensionless parameter. We also derive the modified equation 
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for the upwind difference scheme, and as may be expected, we find that upwindilig adds 
numerical diffusion. 


For the discussion here we restrict our attention to the special case when the coefficients 
a and b in (1.1) are constant and the source term F vanishes. Then for convection with 
velocity V in the direction (cos 0, sin #) we have the convection- diffusion equation 


V cos 6 d x u + V sin 0 d y u = e An. (2.1) 

The reduced equation for (2.1) is 

V cos 0 d x ll + V sin d d y U = 0, (2.2) 

and its characteristic curves are given by 

^ = Vcos6, ^L = VsmO. (2.3) 

For the discussion here it suffices to restrict our attention to directions 0 < 6 < tt/4. We 
remark that the special case 9 = 0 of flow parallel to the grid was examined by Hedstrom 
and Osterheld [13]. 

For (2.1) we use a rectangular domain 

ft = {(*>y):0 < x < 1,0 < y < 1}. (2.4) 


Thus, under the conditions that 0 < 9 < 7t/4 the inflow boundary is at x — 0 and at y = 0, 
and the other two sides of the rectangle comprise the outflow boundary. On the inflow 
boundary we select a point of discontinuity y 0 , and we impose the conditions 


| 0.5(1 -f sgn(y — y 0 )) for x = 0, 
\ 0 for y = 0. 


(2.5) 


The discontinuity in the boundary data at y 0 induces an internal layer along the characteristic 
curve a: sin# — ( y —y 0 )cos9 = 0. In fact, the solution U to the reduced equation (2.2) with 
boundary data (2.5) is given by 


U = ^ - ^sgn(zsin# - (y - y o )cos0). - (2.6) 

In order to minimize ordinary boundary layers along the outflow boundary, we impose the 
reduced equation (2.2) as boundary condition at x = 1 and at y = 1 . * 

Consider the standard central-difference scheme for (2.1). We impose a square grid on ft 
with mesh size h, and we define the shift operators 


T x v(x, y) = v(x + h, y), T y v(x, y) = v(x, y + h ). 


(2.7) 
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With this notation the central-difference approximation to (2.1) is 


VCOS \t, - + ^££(T,- T-')v = <Dv, 


2h 


2h 


( 2 . 8 ) 


where D denotes the discrete Laplacian 


D = ^(T x + T y + T- 1 + T- 1 - 4/). 

On the inflow boundary T/ the boundary conditions for (2.8) are (2.5). On the outflow 
boundary To we use an upwind discretization of (2.2). 

» 

The modified equation for (2.8) is best written in terms of the rotated coordinate system 
aligned with the flow direction • 


s = x cos 0 + (y — y 0 ) sin 0 , 
t = —x sin 0 + (y — y 0 ) cos 0. 


(2.9) 


We also introduce scalings of s and t in order to derive a modified equation in dimensionless 
form and to identify the pertinent parameters. It happens that for (2.1) or (2.8) on the 
halfplane x > 0 there is no natural length scale in the direction of the flow (the s-direction). 
One may as well use a length scale L — 1. For the rectangular domain ft defined in (2.4) it 
is reasonable to take L to be the width of ft (L = 1) or the width of ft in the s-direction 
(L — sec#). We shall see that the natural scalings for the modified equation corresponding 
to the centred difference method (2.8) are 

s = La, 


Furthermore, the important dimensionless parameters for (2.8) are the cell Reynolds number 



e 


and the degree of streamwise resolution 


h 



( 2 . 11 ) 


( 2 . 12 ) 


In terms of these parameters the modified equation for (2.8) is given by the following theorem. 

Theorem. Suppose that 0 < 0 < w/4. Suppose also that 7 < 1 and that 7 *C Rh *C 
I/7. Then the modified equation for (2.8) is 


= dlv - + (j- h + ~^t (3 + cos ( 4 "))) a >- ( 2 - 13 ) 
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Remarks. The restriction that 7 = h/L <C 1 is reasonable for numerical computations, 
since we would want features in the streamwise direction to be resolved. The condition that 
7 <C Rh I/7 is also ordinarily satisfied in computations. We have written the modified 
equation in the form (2.13) in order to provide uniformity as sin(40) -+ 0. The grid-induced 
oscillations appear only when sin(40) ^ 0 and when Rh is moderately large. (Remember 
that we require Rh~f C 1.) Under the condition that J?/,sin(4tf) is bounded away from zero, 
the modified equation (2.13) takes the simpler form 


d.v = di v - 0 r v 


In (2.14) the parameter 


P = 


sin(40) 

24 




(2.14) 


(2.15) 


measures the importance of the grid-induced numerical dispersion relative to the physical 
diffusion, and no numerically induced oscillations will be observed if /? is sufficiently small. 
For boundary data v — sgn(r) at r = 0 the solution of (2.14) may be expressed in terms of 
the Airy function, as is shown by Chin and Hedstrom [3]. In fact, a Fourier transformation 
with respect to r shows that 


j + 


/: 


1 


i27ru> 


exp 


{i/?<ru ; 3 — 


era; + iru; 


| dw. 


(2.16) 


The reference [3] also provides figures and tables of the integrals (2.16) for various values 
of (3. The upshot is that whether or not there are oscillations depends on a parameter 


2, t 1 ' 3 

(3£) 2 / 3 ' 


(2.17) 


If a > 2, the diffusion is dominant, and there are no oscillations. For a < 2, however, there 
is a sequence of damped oscillations below the layer (r < 0). Because a is an increasing 
function of <r, as we proceed downwind a increases and the diffusion eventually removes the 
oscillations. 


Note with regard to the applicability of the modified equation that the internal layer is 
many grid cells wide and that the oscillations have wavelengths spanning many grid cells. 
This behavior makes the modified equation applicable, in that the derivation of a modified 
equation is based on the assumption that the numerical solution is smooth relative to the 
grid. The user of modified equations must always keep in mind that they are utterly useless 
in predicting variations in the numerical solution on the scale of 2 or 3 grid sizes or shorter. 


Finally, we also remark that the upwind difference scheme 

cos 0 / -1 v sin 0 1 . 

V—V - T - 1 ) » + V— (I - T~ l )u - e Du 

with the scalings (2.10) has the modified equation 

d a v — ai d^v + a 2 dlv 
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with 


<*i = 1 + ^ sin(20) cos (f - $) , 

«2 = 7^ + ^(3 + cos(40) + 24(cos 3 0 -f sin 3 0)). 


The most significant numerical viscosity added by the grid effect is the deviation of aj 
from 1. * 

Proof of the theorem. The idea of the modified equation is to make an ansatz that 
the solution of the difference scheme is smooth enough to be represented by a small number 
of terms of its Taylor expansion and to use this expansion to identify a partial differential 
equation which approximates (2.8) more closely than the original equation (2.1) does. 

Thus, we begin with the assumption that some smooth function v is a solution of (2.8). 
In this case the word ‘smooth’ is taken to mean that we may use Taylor approximations such 
as 

T x v = v + h d x v + y d 2 v + y d 3 v + ^ d*v (2.18) 

for the terms in (2.8). 

That is, we choose to neglect terms in the Taylor series approximation to (2.8) of order 
h B and higher. We therefore obtain the equation 


V cos $ (d x v + + V sin 6 [d y v + ~ d 

= e (< %v + d 2 y v + £(d*v + d*v)) . 


The rotation of coordinates (2.9) then gives 

Vh 2 eh 2 

Vd,v + -y- L 3 [v ] = e(d 2 ,v + d 2 v) + — L 4 [v] (2.19) 

with 

L 3 [v] = |(3 + cos(40)) d?v - f sin(40) d]d t v + f sin 2 (20) d,d}v + \ sin(40) dfv, 

L t [v] = ±(3 + cos(40))(d> + dfv) - sin(4 B){d]d t v - d,dfv) + 3 sin 2 (2 0)d]d 2 v. 
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Because we are interested in the effects of the internal layer, we expect derivatives of v 
in the ^-direction (perpendicular to the layer) to be significantly larger than derivatives in 
the ^-direction. The scalings (2.10) are selected to balance the terms d 9 v and dfv in (2.19). 
Thus, upon substituting (2.10) into (2.19) and dropping terms smaller than O^iZh), we 
obtain 

d a v = dlv-p d 3 v + ^-d\v + 2^(3 + cos(40)) d*v (2.20) 

with P as defined by (2.15). 

The inexperienced user of modified equations may expect (2.20) to serve as a modified 
equation for (2.8). We cannot use it because the term involving B^v renders (2.20) unstable to 
high-frequency perturbations. The use of such a modified equation would predict numerical 
instabilities where there are none, and it is an instance of a modified equation not conforming 
to the difference scheme for phenomena of short wave length. The term d*v appears in (2.20) 
because we stopped the Taylor expansion (2.18) at d*v, and we went that far because the 
coefficient P of d 3 v in (2.20) is zero when sin(4d) = 0. That is, we must replace by 
something reasonable but harmless when P is near zero, and for other values of P it need 
only be something harmless. Because d a v ~ d\v when P fa 0, the substitution we make to 
render d*v harmless is that d*v fa d 3 v. In this way a high-frequency instability is converted 
into an increase of dissipation in the streamwise direction, and it produces our modified 
equation (2.13). (This trick was also used in [13] in the special case 9 = 0.) 


Remarks to mathematicians. The above argument contains some sleight of hand. 
In particular, the domain Q was replaced by the halfspace s > 0 or, equivalently, a > 0. 
This change removes any ordinary boundary layer which may be present at the outflow. 
In addition, boundary data for (2.13) are to be applied at the rotated boundary s — 0. 
We expect these distortions to introduce discrepancies only near the point of discontinuity 
(x,y) — (0>J/o)> In particular, it appears from our computations that there needs to be 
a slight shift of coordinates. See the comments concerning Fig. 1 below. We have also 
not shown that the solution of the modified equation (2.13) bears any resemblance to the 
solution of the difference scheme (2.8). Such a proof would probably proceed as in [13] with 
the replacement of 0 by the halfspace x > 0, followed by a Fourier transformation of (2.8) in 
the y-direction. The modified equation (2.13) shows that the canonical form of the integral 
representation of v is 


v(<r,r) 


1 r 

2 +/_ 


f(u>) /• 3 

exp < ta 3 o; 

z27ru7 *• 


a 2 U> + iaiio 


| du> 


( 2 . 21 ) 


with / and the a j dependent on a and r. Furthermore, the integral (2.16) indicates that 
f fa l, a 3 fa Pa , a 3 fa <r, and ai fa r when cr 1 and |r| •< 1 . The integral (2.16) derived 
from the modified equation (2.14) is merely a nonuniform asymptotic approximation which 
is valid when |r| <C 1, cr 1, and when /? is bounded away from zero. We see from the form 
of (2.21) that a uniform asymptotic estimate would require investigation of the interaction of 
two saddle points and a pole. For the case when sin(40) = 0 the situation is simpler because 


? 

I 
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a 3 — 0 and there is only one saddle point and a pole. Uniform asymptotics for 9 = 0 are 
presented in Hedstrom and Osterheld [13]. 

A computational example. In our computations to illustrate these oscillations we 
located the point of discontinuity at y 0 = 0.25, we chose coefficients 

V cos 9 — 2, U sin 0 = 1, e = 0.002, 

and we used a mesh size of h — 0.02. This gives a cell Reynolds number of moderate size 
Rh = 10\/5) and with L — 1 it gives 7 = 0.02. The scaling (2.10) is therefore yjLcJv « 
0.0946, and the value of j3 in (2.15) is /? fs 0.598. The cross section at x = 0.8 is shown in 
Fig. 1, where the solution to (2.8) is shown as a solid curve and the Airy integral (2.16) is 
given as dashes. We must admit that in order to obtain such a good match of the curves, 
we had to shift the jump for the Airy integral from y 0 to y 0 -f h. This could be because the 
Airy integral applies to the rotated coordinate system (s,f) given by (2.9). It should also be 
noted that there is a phase difference between the two curves in the oscillatory region. This 
is a well-known deficiency of modified equations, and it results from the nonuniformity of 
the asymptotic approximation. At the point ( x,y ) = (0.8, 0.6) near the overshoot the value 
of the parameter a given by (2.17) is a » 1.294. We have oscillations because a < 2. 

The numerical method we used to solve (2.8) is a combination of ideas from Elm an 
and Golub [9] and from Chin and Manteuffel [6]. As in Elman and Golub, we introduce 
a red-black ordering on the grid points and do a cyclic reduction to obtain a nine-point 
scheme on the black grid points. This reduction produces a matrix much better conditioned 
for iterative methods. The iterative method used by Elman and Golub is point Jacobi, 
mostly because they impose no constraints on the direction of flow. In our example the 
flow is one-directional, so we follow Chin and Manteuffel in using line Gauss-Seidel with 
lines transversal to the flow, starting at the inflow boundary and marching downstream. We 
find that this scheme converges very rapidly, with the greatest speeds at high cell Reynolds 
numbers. (Perhaps, we should reiterate that the point of this section is to show that rapid 
solution of the matrix equation should not be the primary objective — its solution is a poor 
approximation to the solution of the differential equation when the parameter (3 in (2.15) is 
large.) 

Let us remark that we have also solved (2.8) in a version with a discrete approximation 
to Neumann outflow boundary conditions 

d x u = 0 for x = 1, , 

d y u = 0 for y — 1. ( 2 ‘ 22 ) 

We found this boundary condition to be satisfactory only for small cell Reynolds number, 
Rh < 5. Otherwise, there are additional small oscillations with period 2 h induced by the 
mismatch at the outflow boundary x = 1. 
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3 Curvilinear coordinates 


In this section we permit the coefficients a and b in (1.1) to depend on the position (jc,y), 
and we present a numerical algorithm for generating an orthogonal coordinate system (a 
chart) aligned with the given vector field (a, b ). Our coordinate system is derived from the 
characteristic curves. We remark that a somewhat different coordinate transformation based 
on characteristics was given by Chin et ai. [5]. 

We again assume that the vector field (a, b) has no stagnation point, so that |a| + |6| is 
bounded away from zero for all (x,y) in SI. For purposes of constructing the mapping, it 
is convenient to do an initial scaling so that a 2 + b 2 = 1. One of our goals is to set up a 
mapping (s,t) *—> ( x,y ) such that s follows the flow in the sense that there exists a positive 
function (p for which 

— <P{ad x + bdy). (3-1) 

Because the vector (—6, a) is orthogonal to (a, 6), the orthogonality requirement (our second 
goal) amounts to the condition 

dt =ip{-bd x + ad y ) (3.2) 

for some positive function In a moment we show that the scale factors <p and ip are not 
arbitrary. 


In part, the construction of such a mapping is easy, because it is easy to integrate (3.1). 
All that is needed is to pick a convenient starting point (xo,Jto) and to integrate the system 


dx 

— = a<P, 

as 

dy A A 

— = b<p, 

as 


x = xo at s = 0, 
y = y 0 at s - 0. 


(3.3) 


This gives a curvilinear coordinate line in f2 corresponding to a constant value of t. The 
image of a line s = constant may be obtained similarly by integrating 


dx 

dt 


-bip, 



x = a; 0 at t — 0, 
y — y 0 at t = 0. 


(3.4) 


We still must ensure global consistency as follows. Let us traverse the edges ‘of the 
curvilinear rectangle sq < s < f 0 < t < <i, and we assume that this rectangle is contained 
in SI. Denote the image of (so,to) as the vertex A. Suppose further that we integrate (3.3) 
from s 0 to si, arriving at the vertex C. We then integrate (3.4) from t Q to t\ and arrive 
at the vertex B opposite A. Let us now reverse the order by first integrating (3.4) from t 0 
to t\ to arrive at the vertex D and then integrate (3.3) from So to sj. Can we be certain 
that we again arrive at the vertex B? It happens that this global consistency question has 
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been answered [15], and that what is required is the vanishing of the Lie bracket [#,,<?*] = 

d,d t - d t d,. 


It is easy to see by a short computation that the vanishing of the Lie bracket [d,,d t ] is 
equivalent to the system of partial differential equations 


- d t (b4 >), 

d t (bip) = -d t {a<t>). 


(3.5) 


Upon differentiating the products in (3.5) and solving for d ,if> and d t 4>, we find that a 
necessary and sufficient condition for consistency is that 


(a 2 + b 2 ) d,ip = <f>(a d t b — b d t a ) — d,a -f b d,b ), 
(a 2 + b 2 )dt4> = —<f>(ad t a + bd t b ) + i{’(bd t a — ad,b). 


(3.6) 


Note that if (a, b) has been scaled so that a 2 + b 2 = 1 , then (3.6) takes the simpler form 

(3.7) 


d 3 ip = 4>(a d t b - b d t a ), 
d t <f> = ifi(b d $ a — a d 9 b ). 


We recognize the system (3.7) as the telegraphers’ equation, written in terms of Lie 
derivatives along the characteristic curves. Therefore, all that is needed for its solution is to 
prescribe values <f> — 1 at t = 0 and ^ = 1 at s = 0 and to march in the s and ^directions 
concurrently. 

It should be emphasized that theoretical questions remain for this grid-generation scheme. 
In particular, there is no guarantee that the solutions <j> and ^ will be positive at all points 
in ft. This is important in that the Jacobian of the transformation (3.3-4) is given by 
J = (a 2 + 6 2 )^. We required at the outset that a 2 -\-b 2 be bounded away from zero. Thus, if 
we are to maintain a nonzero Jacobian, we must take special measures whenever it happens 
that <f> < 0 or ^ < 0. One possibility is to back up and put a boundary on this local chart. 
We could then initialize a new chart and continue. 


4 Domain decomposition for an internal layer. 


In this section we present a computational example which uses domain decomposition to 
resolve an internal layer. At this point we have not yet implemented the algorithm described 
here, but the final report will have computations. In our algorithm we first identify the 
internal and boundary layers, and we then set up a domain decomposition to segregate 
them. The domain decomposition is carried out with overlapping grids using the tools of 
Chesshire and Henshaw [2]. We have added the modification that in some subdomains we 
use the grid-generation algorithm of Section 3. 
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As our domain fi we use the square 0<x<l,0<y<l, and on f] we consider the 
convection-diffusion equation 

(1 + x)d x u + (1 - y)d y u = eAu. (4.1) 

As boundary conditions for (4.1) we prescribe u = 0 on the bottom of 0, (y = 0), u = 1 on 
the left-hand edge (x = 0), u = 1 on the top ( y = 1), and u = — 1 on the right-hand edge 
(x = 1). 

Note that in (4.1) we have chosen coefficients so that there is no turning point in (1. That 
is, we have |1 -f x| -f |1 — y \ bounded away from zero in fi. Note also that by the discussion in 
Section 1 the inflow boundary T/ consists of the bottom y = 0 and the left-hand side x = 0 
of the square f). Furthermore, the top of the square y = 1 is a tangential boundary Ty, and 
the right-hand edge x = 1 is an outflow To. The reduced equation is 

(1 + x) d x U + (1 — y) d y U = 0, ' (4.2) 

and its boundary conditions are imposed on the inflow boundary Tj. It so happens that 
we can write down a formula for the solution U of (4.2), although this is not necessary for 
our domain-decomposition algorithm. The characteristic curves for (4.2) are the hyperbolas 
(x — l)(y + 1) = const. Thus, the solution of the reduced equation (4.2) is 

rj _ \ 1 if y >x/(x + 1), 

(0 if y < x/(x + 1). 

This gives us an internal layer along the hyperbola y = x/(x + 1) and exponential boundary 
layers at the outflow boundary x = 1. It happens that we imposed boundary data along 
the tangential boundary IY such that no boundary layer resides there. If there had been a 
boundary layer along IV, we could have modified the domain decomposition described below 
so as to include its effects. 

As the problem is stated, we need the following subdomains: (1) a square B of diameter 
0(e) at the origin to cover the birth of the internal layer, (2) an internal-layer region 

^ = {(*.y): \V -*/(* + 1)1 < Gy/ex} 

with O(e) < x < 1 — O(e), (3) three outflow boundary layers O, one above the internal layer, 
one below it, and one interacting with it, (4) an outer region 7i above the internal layer on 
which u ~ 1, and (5) an outer region H below the internal layer on which u ss 0. 

In the two outer regions “H we use a coordinate system derived from the characteristics, as 
described in Section 3. In the internal layer X we use a parabolic coordinate system imposed 
on the characteristics. (More precise details will be given in the final report.) Finally, in the 
birth B and boundary-layer regions O we use the methods given in the papers by Hedstrom 
and Howes [11] and [12]. The iterations are performed in the order: (1) the outer regions 7f, 
(2) the birth region N, (3) the internal layer T, (4) the outflow boundary layers O. The 
iterative schemes in the subdomains are as in [11] and [12]. 
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Abstract 

Interface conditions for coupling the domains in a physically motivated domain 
decomposition method are discussed. The domain decomposition is based on an 
asymptotic-induced method for the numerical solution of hyperbolic conservation laws 
with small viscosity. The method consists of multiple stages. The first stage is to 
obtain a first approximation using a first-order method, such as the Godunov scheme. 
Subsequent stages of the method involve solving internal-layer problems via a domain 
decomposition. The method is derived and justified via singular perturbation tech- 
niques. 


1 Introduction 

This is a report on a preliminary investigation of conditions for the interfaces between sub- 
domains when solving partial differential equations. The analysis for the method is a combi- 
nation of asymptotics and numerical analysis. The result is a physically motivated domain 
decomposition method where different partial differential equations may be solved in different 
domains. Since different modeling equations are in different subdomains for the same prob- 
lem, we call this heterogeneous domain decomposition. The numerical treatment of interface 
conditions between the subdomains must be addressed. The approach here is to examine 
the physics reflected in the numerical method used within the subdomains and guarantee 
that this same physics is reflected in the interface treatment. 

The method is best suited to partial differential equations that contain regions of singular 
behavior. A typical situation is when there are narrow regions where the variation in the 
solution is large. Such regions are called boundary layers or transition layers depending on 
whether they are near a boundary or inside the interior of the domain. Examples of such 
situations are laminar flow of a slightly viscous fluid or combustion with high activation 
energy. Classical schemes applied to these types of situations generally fail to correctly 
describe the behavior inside the layers. This difficulty is overcome by utilizing asymptotic 
analysis that reflects the physics of the problem. Here we present and motivate the domain 
decomposition method, but the details of the analysis are presented elsewhere [7]. 

There have been some intersting results regarding interface conditions for heterogeneous 
domain decomposition where Euler equations are coupled with Navier-Stokes equations [9], 

'Research conducted at ICASE, NASA Langley Research Center, Hampton, Virginia, supported by NASA 
Contract No. NAS1-18605. 
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and where viscous and inviscid equations where coupled [2, 4], Many of the basic ideas 
relating to asymptotic analysis and numerical methods that utilize domain decomposition are 
found in [10]. These ideas were incorporated into a parallel numerical method in [5]. Specific 
application to conservation laws have been developed in [1]. There are other important works 
in these areas-these references are only a small sample of the literature. 

The coupling of the problems in the subdomains is based on a balance of the flux across 
the interface. Each subdomain is treated as a control volume, and the flux into and out-of 
the control volume is balanced. This is similar to the flux- differencing methods used within 
the subdomains. The result is a numerical method with no visual artifacts. This numerical 
treatment of the interface is an extension (to heterogeneous domain decomposition) of the 
work by Osher and Saunders [11]. We expect extension of this method for the interfaces to 
work for two dimensional heterogeneous domain decomposition, since it was used for a two- 
dimensional homogeneous domain decomposition method that utilizes adaptive refinement 
[ 8 ]. 

2 Problem Setting and Domain Decomposition Mo- 
tivation 


Consider the Cauchy problem 


dU 

at 


+ £W) = (W)- f) f” Me n 

U(x,0) — V(x) for x £ ]R. 


( 2 . 1 ) 


Here the solution U £ IR n is a vector-valued function with n components, the domain is 
Q = IRx]0,T[ and e << 1 is a small parameter. 

We assume that V is piecewise smooth. We also assume F and P are regular functions 
of U. We suppose that P is a suitable viscosity matrix [3] for the shocks of the associated 
inviscid problem 

g w + &F(U°) = 0 for 

( 2 . 2 ) 

U°(x, 0) = V(x) for x £ IR. 

Namely, a shock-wave solution to (2.2) can be obtained as a limit of progressive wave solutions 
of (2.1). Problem (2.1) is a parabolic-hyperbolic singular perturbation problem driven by 
( 2 . 2 ). 

The regions where the solutions to the associated inviscid problem fail to be good ap- 
proximations to the solution of the full problem are the regions where we use a subdomain 
to localize the behavior of the solution. Thus, we have two types of domains. The first type 
of domain is located where the regular expansion 


U. 


outer 


= U° -YtU 1 + e 2 U 2 + ... 


(2.3) 
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for U is valid and the solution is smooth. The second type of domain is where the solution 
exhibits singular behavior and the regular expansion for U is no longer valid. 

We substitute U“ s utcr in the differential equation of (2.1) and use identification in e to 
obtain that U° must be a solution of (2.2). The inviscid problem (2.2) has many weak 
solutions; it is possible to uniquely define U° by considering the problem that governs U l 

[7]- 

The failure of the regular expansion is reflected by some of the terms in the PDE governing 
U° being significantly larger than other terms. Typically, the term RHS(U°) will become 
unbounded as the small parameter t tends to zero. For finite e, a large RHS(U ° ) would 
indicate that the region should be covered by a subdomain in which we apply techniques 
designed to capture the singular behavior of the solution. We describe how to use a measure 
of the numerical approximation of RHS(U ° ) to place the subdomain boundaries in a later 
section of this manuscript. 

I 

2.1 Problem in the Singular Region 

So that we can handle the regions where solutions to problem (2.1) contain shocks that 
interact with other singularities we use a brute force approach that will capture all possible 
behavior of the solution. The approach is to use the coordinate system 



in the regions with shocks. We will present and motivate the domain decomposition method, 
but the details of the analysis are presented elsewhere [7, 6]. Under this transformation the 
PDE that governs the solution becomes 


dU 

dr 





(2.4) 


where U((,r) = U(x,t). This is the equation that is solved in the singular region. 

This scaling is most appropriate for regions where shock-layers are interacting with other 
non-smooth physical phenomena. Because the transformation a priori resolves all of the 
physics. This is reflected by all of the terms in (2.4) having magnitude of order unity or 
smaller. In general, this method is overkill, similar to using a shotgun to dispatch a housefly. 
We choose to study only this brute-force approach so that we concentrate on one type of 
interface. Other treatments that include more of the physics are possible [7]. They can result 
in more efficient numerical methods than the one discussed here. 

The boundary condition at the interface is to impose that the viscous equation from 
problem (2.1) be the model at the interface between the subdomains. The computational 
implications of this condition is discussed in §4. 
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3 Conservative Discretizations 


It is important for the discretization techniques to satisfy a discrete conservation relation. 
One can verify that if the discretizations can be written in the form 


z* +1 = z? 


M^ l t+ 1/2 — ^- 1 / 2 )) 


then the method satisfies the appropriate conservation relations. Here we use flux differencing 
methods based on a finite-volume formulation of the problem. 

We will discuss the differencing method for the outer region subdomain where the solution 
is smooth first. Let Wo be the discrete numerical approximation to U°. We use a first-order 
finite- volume method. This method assumes that the value Wq^ is an approximation to the 
average of the desired function U° over the spatial interval ]ar^_ 1 / 2> Xj +1 / 2 ] at time t = kAt . 
The method can also be categorized as a flux differencing technique since the general form 
of the discrete analogue to the original PDE can be written 

WiF = K<- *(*?+./, - *?-,„) . (3.5) 

where 

*? t i /, ~ 

Here the fluxes are based on the first-order Godunov scheme; thus, the flux fj for com- 
ponent Wj of Wo is approximated as 


/*+./, = 5 [/XO + UKw) - - w,A\ 


(3.6) 


where a; is an approximation of the upper bound on the local speed of sound. 

The discretization that is used for the numerical method in the shock-layer region is 
a modification of the treatment used for the outer region. We have used a coordinate 
transformation that creates a smooth problem for this subdomain. Let Wq be the first 
order numerical approximation to U . Let Wjf* be an approximation to the the average of 

the desired function U over the spatial interval ^+ 1 / 2 ] fi me T = kAr. The flux 

differencing technique is 

(3.7) 


wf' = Wt - A(f? +1/ , - ft.,,) 


■JG 

•fl/2 


where 


[A 

i+1/2 


nHuJ 


- 1/2 > 
dw£ 

uyv I+1/2 


The particular discrete form for each component of the flux is obtained using a formula 
similar to that of Equation (3.6). 

We are not restricted to this particular numerical discretization; however, the numerical 
treatment of the interface will possibly need to be modified for different numerical treatments 
of the problems within the subdomains. 

One can verify that the flux differencing methods given above satisfy the discrete con- 
servation relation. What remains is to formulate the conditions at the interface so that the 
relation will be satisfied globally. 
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4 Treatment of the Interface 


Using the shock-layer coordinates with = CAx will result in (7/e pointsjn the shock- 
layer for each point in the outer region. Here, a typical value for e is .01; hence, this results 
in a radical grid refinement for the shock-layer. For the numerical method, since there will 
be many grid points in the shock-layer for each point in the outer-region, we will refer to 
the shock-layer grid as the refined grid, and the outer region grid will be called the coarse 
grid. The temporal coordinate will also be stretched, resulting in the situation outlined in 
Fig. 4.1. 


x i-l/2 




time 



space ► 

Figure 4.1: Interface at the left boundary 


4.1 Flux Treatment of Interface 

As in [11], we view the interface treatment as a predictor-corrector method on the coarse 
mesh. We start at time t — t k . The coarse-grid values are defined everywhere, and are the 
average of the corresponding fine-grid values when the coarse-grid volume element is within 
the fine-grid region. 

The steps for the first order method are outlined in Algorithm 1 below. At time step 
k , the shock-layer has N(k) points in the interior of the region and a ghost point on each 
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For k — 1 , 


I. March Wo from £*-i to t k based on scheme (3.5). 

II. Detection. 

A. Compute the residual on the coarse mesh. 

B. Mark regions that should be refined. (Let this be the region between Xj t _i /2 
and Xi R+1 / 2 - 

A. Modify shape of refined region. 

III. March the shock-layer region from 4 to 4+i- For k = 1 to K 

1. Form the initial condition in newly refined regions. 

2. Use linear interpolation to compute the ghost values of Wq 

3. March Wq to r^ +1 based on scheme (3.7). 

IV. Project Wq onto Wo- 

V. Correct values Wq L and IVq r based on the shock-layer fluxes. 

Algorithm 1 Numerical Method. 


side of the refined region. There are a few points that need to be clarified in this algorithm. 
The interpolation to obtain ghost values (i.e. H 7 ^) is bi-linear interpolation based on 

WqJ i 1 _ 1 and Wql-i- The initial condition for this problem is derived by imposing mass 
conservation; thus, the fine-grid values are all initialized to the value of the solution at the 
cell center. Improvements in the initialization procedure is a subject of further research. 

The correction of the coarse-grid values in Step VI is to use the same discretization that 
was used when the values were originally computed, but to modify the fluxes at the boundary 
of the domain to reflect what happened on the refined region. That is, to update Wq.l* we 
would use scheme (3.5) with (3.6) for but we would compute with the formula 


rifc-1 

1/2 


l 


K - 1 






One may verify that this results , in a globally conservative method. Also, this treatment 
of the boundary is consistent with the boundary conditions imposed in §2.1. Namely, this 
treatment of the interface is consistent with the viscous equation from problem (2.1) being 
the model at the interface between the subdomains. 


4.2 Dirichlet Treatment of Interface 

As a comparison to the flux boundary condition, we also implemented the heterogeneous 
domain decomposition method with dirichlet boundary conditions at the interface. This 
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is an interesting comparison, since there was little difference in the results when the two 
different treatments of the interface were used (this is discussed in § 6 ). 


5 Detection of Interface 

We present the detection of the interface for the sake of completeness. Detection of the 
interface based on computational data results in a method that can have a different location 
of the internal-layer subdomain for each time step. The detection for the numerical method 
is based on obtaining an approximation to 


d W 0 dF(W 0 ) 


dt 


dx 


— e 


d_ 

dx 


P(W 0 ) 


dWo \ 

dx) 


This term is the residual from using W 0 as an approximation to the solution of ( 2 . 1 ). The 
residual is of magnitude 0( Ax" 1 ) in either a shock layer or in a zone where a shock interacts 
with other singularities. , 

It is also possible to use an approximation of the viscous term J^Wo(-,£jr) to localize 
some of the singularities. For example, this viscous term will be of order f5(Aa> -1 ) in a 
shock layer or in a zone of interaction. This method is not as reliable as using the residual, 
however. Other types of behavior can be located and identified using these techniques [ 7 ]. 
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Application to the Isentropic Gasdynamic 
tions 


In this section we examine the interface treatments on the viscous isentropic gasdynamic 
equations^ : . 3 , , 

du dv .... _ : 

It ~ d^ ^ ° 


dv 

dt 


d_ 

dx 


( 1 \ d ( du\ 

Vu'v dx \dx) 


Here u is the inverse of the density and v is the velocity. These equations are obtained from 
the conservation of mass and momentum in Lagrangian coordinates assuming that u is equal 
to the pressure raised to the — 1 / ^yth power (the perfect gas law). The experiments were run 
with 7 = 2 . 2 .‘ " 

The problem is a right-traveling shock interacting with and a left-traveling rarefaction, 
both of which eminate from the origin. An analytic self-similar solution (a rarefaction emi- 
nating from the origin) to the inviscid isentropic gasdynamic equations is given by 


u(x,t) - 7 1 At+ 1 ) ^ 


a- \ -*/(?+!) 


( 6 . 8 ) 
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v(x, t ) = (f) ^ + const. ■ (6.9) 

An initial condition with a shock and rarefaction eminating from the origin is constructed by 
connecting left values to middle values with a rarefaction. The middle values are connected 
to the right values with a shock. Thus, the initial condition is given by 


where 



for x < 0 
for x > 0 

for x < 0 
for x > 0 


( 6 . 10 ) 

( 6 . 11 ) 


Ui = 1.4709, Ur = 2.5000, V L - 1.0388, Vr = 0.8050. 


The middle value of the solution between the shock and rarefaction is (Um, Vm) = (1.973, 1.356). 
We remark that the middle values were was chosen using the Rankine-Hugoniot condition 

Vm - Vr _ 1 /£% - 1/U]g 
Ur — Um Ur — Vm 

We expect the the viscous perturbation to have little or no effect on the speed at which 
shocks and rarefactions travel; thus, we will compare the viscous solutions to the solutions 
given above. 

The method was run with e = .01. The discretization parameters for numerical solution 
in the outer region have CFL number At/ Ax = .1, and Ax = .02. The discretization on 
the scaled coordinates inside the shock-layer is based on A£ = .1, with the CFL condition 
At/A£ < .025 and the stability condition Ar/A£ 2 < .1. These values are well within the 
limits imposed for the stability of the finite difference methods. 

Figure 6.2 depicts the evolution of the internal-layer subdomain when the two differ- 
ent boundary conditions are used. The errors generated by using the dirichlet boundary 
condition when the rarefaction is trying to exit the internal-layer subdomain result in a 
larger computed second derivative, and the detection scheme kept the rarefaction inside the 
internal-layer much longer. The solution projected onto the coarse grid at the end of the 
computations showed little difference between the two methods (Fig. 6.3). The primary dif- 
ference is the visual artificats at the boundary of the internal-layer subdomain at the point 
when the rarefaction is exiting the subdomain (Fig. 6.4). 


7 Conclusion 

Clearly the best interface condition is the flux-based treatment; however, the dirichlet bound- 
ary conditions did not induce as many errors as expected. One explaination of the lack of 
errors may be that the internal-layer subdomain boundary moves fast enough that waves 
propagating out of the internal-layer subdomain are allowed to pass across the bbundary 
by the oscillations in the boundary. More studies are planned with the goal to identify the 
precise nature of the errors associated with the interface treatments. 
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Evolution of IL (Dirichlet) 



Evolution of IL (Flux) 



Figure 6.2: Evolution of the Internal-layer Subdomain. 


Solutions at t = .24 (Flux) 



Solutions at t = .24 (Dirichlet) 



Figure 6.3: Solution on Coarse Mesh at t — .24. 


Solutions at t = .16 (Flux) 



Solutions at t = .16 (Dirichlet) 



Figure 6.4: Solution on Fine Mesh at t = .16. 
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One-Way Nesting for a Primitive Equation Ocean Model ^ ^ 

D.W. Blake 

Naval Oceanographic and Atmospheric Sciences Research Laboratory 

Stennis Space Center, MS 39529-5004 ^ ^ ^ 


Prognostic numerical models for atmospheric and oceanic circulations require initial 
fields, boundary conditions, and forcing functions in addition to a consistent set of par- 
tial differential equations, including a state relation and equations expressing conservation 
of mass, momentum and energy. Depending on the horizontal domain to be modeled, the 
horizontal boundary conditions are either physically obvious or extremely difficult to specify 
consistently. If the entire atmosphere is modeled, periodic horizontal boundary conditions 
are appropriate. On the other hand, the physical horizontal boundaries on the entire ocean 
are solid walls. Obviously, the normal velocity at a solid wall is zero while the specification 
of the tangential velocity depends on the mathematical treatment of the horizontal viscous 
terms. Limitations imposed by computer capacity and cost, as well as research interests, 
have led to the use of limited area models to study flows in the atmosphere and ocean. 
The limited area models do not have physical horizontal boundaries, merely numerical ones. 
Correctly determining these open boundary conditions for limited-area numerical models has 
both intrigued and frustrated numerical modelers for decades. 


One common approach is to use the closed or solid wall boundary conditions for a limited- 
area model. The argument given for this approach is that the boundary conditions affect 
flow near the walls but that none of these effects are propagated into the interior. Therefore, 
one chooses a big enough domain that the central region of interest is not corrupted by the 
boundary flow. Research in progress to model the North Atlantic circulation (J. D. Thomp- 
son, private communication) vividly illustrates the pitfalls of this approach. The area covered 
by the Atlantic Ocean model lies between longitudes 0 and 100W and between latitudes 60N 
and 20S with the continental boundaries in place as appropriate and the open water bound- 
aries artificially closed. Two model runs are compared: (A) The southern boundary at 20S 
between latitudes 0 and 40W is artificially closed and (B) the same boundary is specified 
as open with an inward transport of 15 Sv (determined from a global model with the same 
physics) uniformly spread across the boundary. Comparison of runs A and B shows sig- 
nificant differences. For example, the maximum eddy kinetic energy (divided by the mean 
density) is 700 cm 2 j sec 2 in run A while that for run B is 1900 cm 2 j sec 2 . The Gulf Stream 
in run B detaches from the eastern boundary of the United States at the correct latitude 
of approximately 40N while the Gulf Stream in run A never truly flows along the eastern 
boundary of the United States at all. The circulation in the tropics and along the eastern 
boundary of South America also differs radically between the two runs. There are regions 
in the two runs where there is no difference but such regions are small and of little interest, 
i.e. they have very low eddy kinetic energy. These studies and others indicate that the inte- 
rior flow of limited-area models can be dramatically affected by the incorrect use of closed 
boundary conditions. 
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A second common approach is to “nest” the limited-area model inside of another nu- 
merical model which covers a much larger domain. The outer domain model then supplies 
the boundary conditions at the open boundaries of the inner domain or limited-area model. 
As an example, the North Atlantic model described above could have boundary information 
supplied by a global ocean model which has physical, solid walls or closed boundaries. The 
outer domain model usually has a larger time step and coarser mesh size than the inner 
domain model. If the inner and outer domain models are described by the same differen- 
tial equations and assumptions, then the nesting problem is homogeneous. Otherwise, the 
nesting problem is heterogeneous. The nesting is described as two-way if information passes 
from the outer domain to the inner domain and vice-versa. If the outer domain model passes 
information to the inner domain but the inner domain information is not passed into the 
outer domain, then the nesting is one-way. Only one-way nesting with a homogeneous sys- 
tem of numerical models is presented here although future work with two-way (or coupled) 
nesting and with heterogeneous model systems is planned. 

In general, nesting involves two separate problems. The first is the interpolation of 
information from a coarse mesh, outer domain, to a finer mesh, inner domain. The second 
is the modification of the information supplied by the outer domain before it is applied to 
the boundary of the inner domain. Much of the research done to date has not distinguished 
between these two separate problems. 

Linear interpolation is the easiest interpolation method to use. However, linear interpo- 
lation alters the long wavelength information contained in the original fields and adds short 
wavelengths that are not present at all in the original fields. Thus, linear interpolation alters 
the energy distribution of the original fields. To avoid these problems, a variation of the 
resampling method commonly used by engineers in the time-frequency domain (B.E. Eck- 
stein, private communication) has been tested. A fast Fourier transform (D.N. Fox, private 
communication) has been modified so that the output fields, after the inverse Fast Fourier 
Transform, have the required fine grid mesh, although the input fields were supplied on the 
coarse grid mesh. After testing, this technique was modified (A. Wallcraft, private com- 
munication) to handle irregular coastal geometry, which also has to be interpolated. This 
interpolation scheme has been used extensively with the Pacific Basin numerical models to 
avoid the lengthy and expensive new spin-ups required whenever the mesh size is changed. 
(Further discussion of the Pacific Basin research can be found in Hurlburt et al. [1]). 

The effects of changing the mesh size are similar in many ways to those found by changing 
the coefficient of horizontal eddy viscosity, A H . Therefore, in order to avoid interpolation 
effects, the open boundary conditions are studies using models with different coefficients of 
horizontal viscosity. There are three model runs to be considered here. The applied run is 
made with the large outer domain and with a large value of Ajj. The nested run is made 
with the small inner domain and a small value of Ah- The true run is made with the large 
outer domain and with a small value of Ah. The boundary conditions applied on the open 
boundaries of the small domain are taken from the matching grid points on the outer domain 
and “adjusted” as described below. 
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The numerical ocean model used for both the inner and outer domain is a reduced 
gravity, one active layer, primitive equation model with the hydrostatic approximation used. 
The fluid is assumed to be incompressible with uniform density in each layer. The effects 
of the density difference between the two layers is ignored except when multiplied by the 
earth’s gravitational acceleration. The prognostic equations for the horizontal components 
of momentum are written in transport form while the continuity equation is the prognostic 
equation for the layer-depth of the upper, active layer. A spherical coordinate system is used 
and the effects of the earth’s rotation are included. For further details of these equations in 
analytic form, see Hurlburt and Thompson [2]. This ocean model will be referred to as the 
NOARL model. 

The outer domain used is a rectangle. The wind forcing is analytic and drives a, double 
gyre in the ocean model. This choice permits the placement of the inner domain to isolate 
various types of flow: normal or tangential to the open boundary, strong or weak, or flow 
which changes along the open boundary either spatially or temporally (for time-varying 
forcing). The work presented here has only one open boundary, either on the western or 
northern boundary of the inner domain, and the other three boundaries are closed, matching 
the outer domain. 

The NOARL ocean model uses a staggered grid to increase the computational accuracy. 
If solid walls (closed boundaries) are used, then the eastward velocity, u, and the northward 
velocity, v are set to zero along the solid walls. It follows that the eastward transport, 17, 
and the northward transport, V , must be zero also on the solid walls. For solid walls, no 
boundary condition for the layer depth, h, is required, needed. If a boundary is open, 
then initial conditions for all five variables u, v , U, F, and h must be specified to obtain 
a numerical solution- However, arbitrary specification of these five variables on the open 
boundary will in general overspecify the solution. In general and in this research, if the 
inner domain open boundary values are supplied directly from the outer domain with no 
modification or adjustments, the inner domain model will eventually “blow up”, much less 
give the correct solution. 

If the open boundary condition cannot be specified exactly, then the goal is to prevent 
reflections at the open boundary which quickly destroy the interior solutions. Most nesting 
work uses some combination of four basic techniques (Koch and McQueen [3]): blending, 
filtering, damping, and radiation. Damping refers to an increase in the coefficient of eddy 
viscosity near the open boundary. Filtering, which is used in many numerical models without 
open boundaries, is the replacement of a calculated value at a given gridpoint with a weighted 
combination of the calculated value and the surrounding values. Blending is the replacement 
of the calculated prognostic term near the boundary of the inner grid with a combination of 
the prognostic term from the inner grid and that from the outer grid. The radiation technique 
(Sommerfeld [5] and Orlanski [4]) calculates the boundary values, assuming a wave is passing 
through the boundary. The first three techniques tend to destroy the small scale structure 
of the inner grid parameters which defeats the main purpose of running the inner grid with 
increased horizontal resolution. The radiation technique tends to let the waves pass out but 
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is limited by the problem of calculating the phase speed needed. The question arises as to 
how the phase speed should be calculated if there are several types of waves present. 

The goal of this research is to produce a nesting technique which does not destroy the 
inner grid solution or reduce any improvements made in the solution by using the finer grid. 
Therefore, no blending nor any additional damping or filtering has been used on the inner 
domain. The radiation technique has been modified from that used by Sommerfeld [5] and 
Orlanski [4]. The wave equation is used, not with an inner grid variable, but with a new 
variable that is the difference between the inner domain and the outer domain variable, i.e., 
Q(inner ) — Q(outer). The actual open boundary condition used on the open boundary is 
the sum of the outer domain solution and the q found from the wave equation: 

dqjdt f cdq/dn = 0, 

where c is the phase speed and n is the direction normal to the boundary. The phase speed 
used is determined from the mean outflow and the inflow phase speed is set to zero. The 
mass exchange along the boundary is the same for the inner and outer domains. 

The quality of the nesting technique is measured by how well the inner domain solution 
(the nested run) compares with the true run (with the outer domain) solution. This difference 
is compared to the difference between the true outer domain solution and the applied outer 
domain solution. The first tests were done with steady forcing and nearly normal outflow. 
For these cases the differences between the true and nested solution after a year are less than 
five percent of the differences between the true and the applied solutions everywhere except 
for a very small area near part of the open boundary where the values go up to 20%. This 
small portion of the open boundary is where both the non-normal flow is the largest and 
the normal flow reverses sign. Note that this region is confined close to the boundary and 
does not propagate into the interior of the inner domain. Model runs have been extended 
for five years. Although the differences between true and applied runs increase with time, 
the differences between the true and nested runs increase much more slowly. Therefore, the 
percentages cited above actually decrease with longer model runs. 

Ongoing research includes testing open boundaries with non-normal flow, strong jets, 
and reversal of flow with time. Also, the nesting technique is being tested with actual ocean 
models with irregular coastlines included. Specifically, a tropical Pacific Ocean model has 
been nested into a Northern Pacific Basin Model for testing. • 

The results to date include: # 

• Open boundary conditions that can handle both inflow and outflow grid points. 

• Phase speed selection is not crucial for regime tested. 

• Horizontal interpolation is more critical than temporal interpolation. 
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• Five-year nested model runs have been completed. 

• Strong tangential flows require both modified h and non-normal treatment of phase 
speed. 

• Differences in variable values between true and nested runs are, in general, less than 
5% of those between true and applied runs. 
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Abstract 

This paper details the ongoing work of combining three existing software programs 
into a nested grid oceanography model. The HYPER domain decomposition program 
[8], the SPEM ocean modeling program [5] and a Quasi- Geostrophy model written in 
England are being combined into a general ocean modeling facility. This facility will 
be used to test the viability and the capability of two-way nested grids in the North 
Atlantic. 


1 Introduction 

We are beginning work on a basin wide coarse grid overlaid with finer grids that follow 
major mesoscale and dynamic features in the North Atlantic basin. The grid management 
will be handled by the HYPER domain decomposition program [8], We will consider several 
combinations of solution methods to be used including nesting a primitive equation fine mesh 
solution method within another primitive equation coarser grid solution, and a primitive 
equation fine mesh solution within a coarser quasi-geostrophic model solution. 

It is well known that to refine the entire coarse mesh in space for ocean circulation 
modeling would be inefficient; it would require large amounts of memory and waste processor 
time in quasi-geostrophic regions. In short, refining the entire coarse mesh is overkill. For 
explicit time-evolution solution methods of primitive equations the advancement must also 
be severely refined in time to account for the gravity wave stability constraint. This results 
in an excessive number of time steps. Alternatively, an implicit solution on a fully refined 
mesh results in a very large matrix problem. 

We are attacking two areas of fundamental ocean modeling directly. The efficiency of 
the boundary conditions between quasi-geostrophic and primitive equation models should be 
advanced based on the insight acquired from our hierarchical approach to the nesting exper- 
iments. The second fundamental area is ocean modeling in general. Nested basin/regional 
grids are a new concept for ocean applications, and in this respect oceanographic modeling 
lags behind atmospheric and aerodynamic modeling. But the success of domain decomposi- 
tion in these more advanced modeling areas provides encouragement that our research efforts 
are timely, central and well directed towards new, successful applications. 


2 The Navier-Stokes Equations on a Rotating Sphere 

To examine the Navier-Stokes equations on a rotating sphere in a rotating reference frame 
i2, let I denote an inertial reference frame, and let r be a radius of that sphere. Then 

(n)| / = ( r t)|fi T ^ x r 
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where r t — u is velocity and the second term on the right-hand side of the equation is the 
motion a non-rotating observer would see because of the rotation of the sphere. Then 

(uj)t|/ = (u R )t\ R + 2H x u R + fi x (Q x r) + x r. 

On the right-hand side of this equation the second term is the Coriolis acceleration, the third 
term is the centripetal acceleration, and the fourth term is the acceleration resulting from 
any changes in the rotation speed. 

For geophysical applications here on Earth this last acceleration term is discarded except 
for very long time scales, and centripetal acceleration can be expressed as a potential, 

n x (fl x r) = — V0 C , 

that then can be added to the gravitational force potential to net a new geophysical force 
potential. 

The total or material derivative of a scalar quantity is the same in both reference frames. 
Hence the form of the conservation of mass and the thermodynamic equations remains the 
same. 

To estimate the frictional forces F, we could assume a Newtonian fluid with a symmetric 
Navier-Stokes internal pressure tensor. But this molecular dissipative strength would have 
an unknown relationship to the dissipative strength of a given mesoscale ocean phenom- 
ena. In general, a qualitative description of the transfer of energy and momentum between 
scales of interest, and not these smaller molecular scales, are parameterized based on known 
qualitative ocean behavior. 

» 

3 Governing Equations 

• 

Coupling hydrodynamics and thermodynamics, consider an adiabatic, inviscid fluid. It can 
be described by conservation of momentum, a continuity equation, and an energy equation 
coupled with an equation of state. That is, 

■^-u -f ftV P + F = 0 

at 

d 

— a — ttV • u — 0 

dt 

^-P + P7V • u = 0 

dt 

where ti 6 fl 3 is the velocity, a is the specific volume, P is the pressure, and 7 is the ratio 
of specific heats. Here F is the Coriolis and gravity forces, and ^ = ^-(-'U*Vis the total 
time derivative. 
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3.1 Primitive Equations 

The hydrostatic approximation neglects vertical acceleration in the vertical equation of mo- 
tion; 

dP_ = Zl 
Oz a 

The Boussinesq approximation replaces density with a zeroth order mean density ev- 
erywhere except where multiplied by gravity. The combined hydrostatic and Boussinesq 
approximations are used to formulate a set of reduced equations known as the primitive 
equations. 

These equations are as follows: 

d 

—ujf + aV H P H F u = 0 
DP 

“«7 + s = 0 

i 

— a — a V * u — 0 
at 

d 

— P + P 7 V-u = 0 

at 

where u H = («i ,«*)*, u = (u H ,wy,V H = 

3.2 Geostrophy 

Geostrophy, or geopotential flow, retains only the balance between the Coriolis force and the 
potential field. Geostrophy: 

fv = grj x 

fu = -gijy 

is the first approximation in an asymptotic expansion of the primitive equations. Here rj is 
the variation of the sea surface height, a measure of pressure. 

3.3 Quasi-Geostrophy 

Large-scale ocean movement is typically quasi-geostrophic. Asymptotically, quasi-geostrophic 
motion has time scales not smaller than the advect.ive time scale. It is geostrophic to low- 
est order, yet retains dynamics. Velocity fields can change, but they do so in continuous 
geostrophic balance with pressure. Hence there are temporal derivatives retained. The 
quasi-geostrophic equations are 

-fv = ~gg x 
v t + fu = -g Vy 
T] t + ( uH) x + {vH) y = 0 . 

Here H + g is the mean height, if, plus the variation of thickness 77. 
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4 Related Research 


Spall and Holland in unpublished work nested a primitive equation model within ^ quasi- 
geostrophic model. They found that the quasi-geostrophic boundary conditions seriously 
dampened the primitive equation physics close to the boundary, reducing them to essentially 
quasi-geostrophic physics in the boundary region. Reducing or eliminating this boundary 
layer is the principle focus of our current efforts. 

Thompson and Schmitz [7] varied the damping time scale on outflow boundary conditions 
for a model of the gulf stream. They found that the outflow dynamics and hence the location 
of the Gulf stream are significantly impacted by the outflow boundary conditions. With 
such strong impact, the possibility of numerical artifacts in regional models due to boundary 
conditions seems large. The lack of existence of well-posed boundary conditions for primative 
equations complicates this problem because no comparisons with true boundary conditions 
can be made even as approximations [7]. See the article by Dr. Blake in this proceedings for 
more detailed information. 

This leads to the importance of getting the boundary conditions physically correct. It is 
known that the subcharacteristics of the Euler equations, upon which the primitive equations 
are based, can have combined inflow and outflow characteristics at both advective inflow and 
outflow boundaries, dependent on the sound speed. Hence the dynamics of the refined regions 
typically has an affect on the surrounding flow fields. Using one-way boundary conditions 
for inflow and one-way for outflow is not, in general, sufficient. There must be a stronger 
interaction between the dynamics of the coarse and the refined meshes. 

To strengthen the interaction Spall and Holland [9] added a direct, but averaged, insertion 
of the streamfunction field (generated using the refined primitive equation depth-averaged 
horizontal velocity values on the refined mesh) onto the coarser quasi-geostrophic solution. 
They allow the refined mesh to dictate the regional external flow component. 

Our hypothesis is that the external flow is insufficient. The strong dynamics in the gulf 
stream are forced by internal instabilities. We are currently testing both baroclinic and 
barotropic nudging for all prognostic variables to establish a stronger relationship between 
the regional dynamics and the coarse mesh solution behavior. This will be done while 
maintaining Spall and Hollands quasi-geostrophic to primitive equation boundary conditions 
[9], which are barotropic, to see if we can improve upon their results. If we are successful, 
we will experiment with a quasi-geostrophic coarse mesh solution and try to find more 
comprehensive two-way communication techniques that reduce or eliminate the boundary 
layer formation found in the previous work. 

We are currently using the barotropic modon defined in [9], and a baroclinic vortex 
problem. 

We are working with flat topography at first, so that we can combine either fixed or 
sigma (stretched) vertical coordinate models. 
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5 Initial Boundary Value Problems 

For modeling purposes these equations must, of course, be viewed as initial boundary-value 
problems (IBVP). Computational models use 1I1VP for hindcasts, nowcasts, and forecasts 
as well as for physical studies that examine phenomena of interest such as energy cascades, 
eddy shedding, and coastal upwelling. There are two prevalent boundary conditions used in 
ocean models: rigid walls and open boundary conditions. The trick, of course, is to find open 
boundary conditions that are well posed and yet not overconstrained. Because one cannot 
simultaneously diagonalize the coefficient matrices for the multidimensional advective terms 
of the Navier-Stokes equations, this is often a time-consuming guessing game. 

To define open boundary conditions in general, define a system of equations 

Lu — F 

with given initial conditions 

u(z : t — 0) — u 0 (x) 
and characteristic boundary conditions 

u l — Su 2 T g 

where u = (w 1 ,^ 2 )* are the prognostic variables, and S is a generalized reflection operator 

W. _ _ 

But for reduced equations there can also be modeling constraints such as hydrostatic 
balance or incompressibility. The modeling constraints assumed in order to reduce the 
equations (that is, the asymptotic balances chosen) must be enforced on the initial conditions 
and the boundary conditions to avoid introducing inappropriate length and time scales. 

Many obvious well-posed boundary conditions are overspecified, which leads to the for- 
mation of boundary layers within which the solution adjusts to the additional information. 

5.1 Open Boundary Conditions , 

The major issue to address is boundary conditions. Oliger and Sundstrom in [4] detail some 
boundary treatment for geophysical problems, and show that point-wise local boundary con- 
ditions for the primitive equations are not well-posed: the regional open boundary problem 
is open-ended. It is not known, however, whether non-local boundary conditions, such as 
those generated with a domain decomposition method where boundary conditions that are 
derived from a larger domain are or arc not well-posed. We may have fewer pro ble ms with 
open boundary conditions at a two-sided boundary. The quasl-geostrophic boundary layer 
within the nested primitive equation model in the unpublished work of Spall and Holland 
indicate that open boundary conditions on a nested grid is a major problem that must be 

addressed before two-way nesting will be successful. ; ; .■ 

Our approach has been to work from the primitive equation homogeneous model back- 
wards to the quasi-geostrophic model, assessing the differences in the information transmit- 
ted across boundaries, to derive better open boundary conditions for the simplier quasi- 
geostrophic model. 
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For the heterogeneous boundary conditions between the primitive equation and quasi- 
geostrophic regions the quasi-geostrophic boundaries need to evolve as if there was primitive 
equation physics in the region surrounding the refined domain. To insure this we are moni- 
toring on the boundary, where quasi-geostrophic physics, which is less vertically diverse, is 
statically stable, and comparing the evolving quasi-geostrophic boundary against a fine mesh 
primitive equation global solution. This is one of our measures of error that is physically 
based. 


0 HYPER 

The HYPER program, described in Perkins [5], looks at domain decomposition as a tool to 
combine grids for computational efficiency and for model flexibility. It currently can locate 
where refined grids should be placed based on asymptotic and-or physical criteria and it 
initializes the refined grids using local uniform mesh refinement. 

Our current work is a static domain decomposition; we are running experiments on the 
influence of the internal boundaries on the flow pattern, and are interested in the flow through 
that boundary. These results are currently being prepared and will appear in a later report. 

The goal is to follow different asymptotic regimes within the ocean basin that are iden- 
tifiable as distinct physical regions of the ocean. The problem is that, once you’re inside 
a reduced physics region, such as a quasi-geostrophic region where there is no ageostrophic 
flow, there may be no way for the model to evolve the complete physics you hope to recover 
by using the refined meshes. For example, in the gulf stream region meanders pinch off to 
form eddies. Many aspects of the physics contribute to this pinching off. In such cases the 
reduced quasi-geostrophic model will not reproduce the ageostrophic behavior in the initial 
conditions of the refined mesh, and hence will miss some of the time dependent interac- 
tions that contribute to the dynamically significant event of ring shedding. There will be no 
ageostrophic signals in the initial conditions to interpolate onto the refined mesh. In such a 
situation, the data used to help initialize the model may make up for some of the missing 
physics, but the time scales may be off. 

7 Domain Decomposition 

Domain decomposition allows the mesh to evolve with the solution. It has been applied to 
elliptic and hyperbolic equations for several years. The interested reader can refer to Chan, 
et al., [2] for an extensive bibliography on elliptic and hyperbolic domain decomposition 
methods. 

Formally we describe the domain decomposition of a discrete coarse mesh, ft 1 , on the 
computational mesh ft, for a fixed time t = nAt , by letting (p(f) — 1) be the number of 
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refined subdomains used at time t and be those subdomains: 

p( 0 

u c d . 

itri 

After the domain is decomposed, local uniform spatial mesh refinement, as developed by 
Berger [1], is applied to the new subdomains. The time step may also be refined on these 
regions. 

The sequencing for one coarse time step of magnitude A c t from time t n to time t n + A c t } 
where n indexes the discrete time steps on the coarse mesh, is presented next. Let the 
temporal refinement ratio from the coarse mesh to the refined meshes be r, and notate this 
rAjt — A c i, so that a subscript “c” informs us that we are on the coarse mesh, and a 
subscript “/” informs us that we are discussing one of the refined meshes. The domain 
decomposition algorithm follows: 

Domain Decomposition Algorit hm 

Advance coarse mesh 

Mark points with significant mesoscale and ageostrophic potential 

Cluster these points into refined meshes 

DO r times 

Solve equations on refined meshes 

ENDDO 

Nudge refined values onto coarse mesh 


When all of the refined meshes have been advanced r refined time steps to the next coarse 
time step, their values at time t n ^ 1 are passed to the coarse mesh where a nudging technique 
modifies the coarse advanced values and produces an aggregate solution on the coarse mcslu 
Let F A , F\ and {F k } p k { *\ represent the discrete operators for the aggregate solution on the 
original discretized mesh ft 1 , the solution on the coarse mesh ft 1 , and the separate solutions 
on each of the refined subdomains {ft* » respectively. Then the aggregate solution on the 
coarse discretized mesh is given by 

F' , (n')=c[{F‘(!!*)}jW,F'(!J')] . (74) 

where the operator C Is a nudging technique that may vary between experiments. 

We use domain decomposition as a tool to combine the explicit coarse mesh solution 
method with the refined mesh solution methods to satisfy our varying numerical requirements 
in a computationally efficient way. Wc use a two-level refinement scheme consisting of one 
coarse mesh and a set of overlaid refined meshes, where the coarse mesh adequately represents 
quasi-geostrophic behavior, while the refined meshes adequately resolves the more physically 
complete primitive equations. 
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Refined to coarse mesh communication (feedback) can take the form of value averaging, 
as in Berger [1] and as in Spall and Holland [6]. 

We are using a nudging data assimilation technique for the initial experiment and we do 
not include explicit conservation enforcement. 

Our domain decomposition work focuses on two-way interactive nested grid communi- 
cations and the development of good internal boundary conditions. We are particularly 
interested in examining heterogeneous open boundary conditions between different Asymp- 
totic regions. Because the major geophysical equations are not known to be well-posed as 
an initial value boundary problem, in general, this issue becomes important. * 


8 Initial Conditions 

Our long range plans are to build a basin wide grid, and overlay it with refined grids about 
regions of ageostrophic dynamic regions. The refined grids will then follow mesoscale or 
planetary scale dynamical features. 

But our current work is much less ambitious. We have constructed a box model and are 
using Spall and Hollands [12] barotropic modon and baroclinic vortex problems to examine 
the viability and desirability of different communication schemes between the coarse and 
refined meshes. 

A barotropic modon is a coherent, concentric streamfunction. The barotropic flow is the 
primary mode of a quasi-geostrophic equation formulated as a Sturm-Liouville problem (all 
other modes of the Sturm-Liouville problem are referred to as baroclinic). It has an analytic 
solution, it is quasi-geostrophic, and uses an infinite beta plane approximation. The result 
is a coherent depth independent (barotropic) structure that moves at constant speed. 

The baroclinic vortex has no analytic solution, and is defined using a Gaussian pressure 
distribution with maximum geos trophic velocity of 100 cm/ sec. The initial velocity fields are 
calculated to be in geostrophic balance with the prescribed Gaussian pressure field. 

Our experiment follows a hierarchical approach. Beginning with a homogeneous domain 
decomposition we use a full, coarse primitive equation model and keep track of the flow 
across the “future” internal boundaries. Then we introduce the nested grid into the same 
problem and analyze any changes in the boundary information flow. This is used as our 
error due to boundary conditions only. This error is measured both in root mean squared 
error and in phase error. Small shifts of mesoscale features are not always bad compared to 
changes in dynamics within those mesoscale features. 

Once we complete our homogeneous studies we will move to a heterogeneous domain 
decomposition with a quasi-geostrophic coarse grid overlaid with a primitive equation refined 
grid. Again we will compare the flow across the internal boundaries. Then we will add a 
feedback loop that uses the nudging data assimilation technique from the refined mesh to 
the coarse mesh and nudge to the true boundary information. This way we can measure the 
improvement due to the nudging feedback loop. 

The quasi-geostrophic coarser model will be forced by the nudging from the refined prim- 
itive equation model. The unforced quasi-geostrophic model is statically stability, but the 
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primitive equation is not. The forced behavior perturbs the reduced equation dynamics, so 
that a time series of its behavior in the refined region will not be statically stability due 
to the feedback interaction. But with known density changes at the boun dari es from our 
true solution, we can measure how well the reduced dynamics are being influenced by the 
regional models with their more complete physical models. Another metric is the apparent 
ageostrophic time series behavior in the quasi-geostrophic coarser model. After nudging, 
the ageostrophic adjustment to the quasi-geostrophic coarse mesh is calculated, and a time 
series of this difference is the ageostrophic forcing of the quasi-geostrophic model. Where 
this difference is small, there is no need to maintain a refined mesh, so this metric can be 
used to eliminate refined meshes that are no longer needed, but it can not help us locate 
where refined meshes should be placed. 

8.1 Semi-Spectral Primitive Equation Model (SPEM) 

The primitive equation SPEM model of Ilaidvogel et. al. [3] has prognostic variables for 
horizontal velocity, u and v , and temperature t, and salinity s. It uses the hydrostatic and 
Boussinesq approximation. The resulting equations are advanced on an scattered Arakawa 
“C” grid in the horizontal, while the vertical is spectral, with Chebyshev modes. It has a 
rigid lid approximation at the surface (no variations or “waves” in sea surface height). 


9 Current Summary 

The computational demands of fully three-dimensional global ocean modeling seem to re- 
quire a nested heterogeneous adaptive grid solution. However, the implementation difficulties 
are robust. The need for physically realistic open boundary conditions is already well doc- 
umented, mostly a result of a “grand challenge” issued several years ago. Our experience 
indicates that an equally pressing need is to provide modeling-consistent asymptotically 
nested initial conditions for each new nested grid. 

The scientific aspects of the work are focused on the boundary condition formulation and 
on the two-way grid communication mechanisms under development. 
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Jeffrey S. Scroggs 
Department of Mathematics 
North Carolina State University 
Raleigh, North Carolina 
and 

A. Louise Perkins 

Department of Earth, Atmospheric and Planetary Sciences 
Massachusetts Institute of Technology 
Cambridge, Massachusetts 


1 Introduction 

The activity in research of domain decomposition for the numerical solution of partial dif- 
ferential equations is increasing at a rapid pace. One motivation for domain decomposition 
is the isolation of physical phenomenae into separate subdomains. The numerical treatment 
(and possibly even the modeling equations) may be different in these subdomains. Thus, 
this style of domain decomposition is heterogeneous . 

This bibliography includes references to works central to the development of heteroge- 
neous domain decomposition. 

The bibliography is by no means complete. Indeed, we would be delighted to receive 
additional references to add to the bibliography. 


2 Adding to this Bibliography 

We would like additional references that are core to the topic of heterogeneous domain 
decomposition. Please use the following guidelines: 


Format 


Medium 


Bibliographic data should be in bibtex format. A set of 
keywords is requested as part of the format. These keywords 
will be used to form the index portion of this database. 

Email messages to Louise Perkins or Jeff Scroggs. 


Assistance To assist with placing the data in bibtex format, send a 
request to either of us for the C program BIB INPUT. This 
program will interactively prompt you for the data, and 6 
produce a file with the formatted entries. 


70 


Disclaimer 


We are trying to keep this bibliography focused, hence 
submissions that do not obviously deal with heterogeneous 
domain decomposition will be eliminated. 


Please send your references to 

Louise Perkins 

54-1420 

MIT 

Cambridge, MA 02139 
(617)253-1291 

perkins@pimms.mit.edu 


Jeffrey S. Scroggs 
Box 8205 

Department of Mathematics 
North Carolina State University 
(919)737-7817 
scroggs@matjfs.ncsu.edu 
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